Behind the Bots: Meet the Human Experts Fixing AI’s Flaws

Chongqing - Chatbots can be impressive—until AI hallucinations deliver misleading but convincing answers. Experts say such errors will persist with current Large Language Models (LLMs) but can be reduced through improved training.

This has made AI trainers, or annotators, one of today's most sought-after cross-industry professions. At Digital Tianma, a new tech company based in Chongqing Liangjiang New Area, we met a group of such professionals.

Digital Tianma is located in the Liangjiang Digital Economy Industrial Park. (Photo/Chongqing Liangjiang New Area Industrial Operation Co., Ltd)

LLMs generate responses based on the data they’ve been trained on. When asked questions beyond that scope, they can produce inaccurate or misleading answers—commonly known as AI hallucinations. 

“Data deviation remains one of the primary causes of AI hallucinations,” said Zhang Shihao, head of the annotation service unit of Digital Tianma, an Ant Group subsidiary specializing in information and operation services.

According to Zhang, expanding and refining training data is key to reducing these errors. His annotation team tackles this by carefully collecting, cleaning, analyzing, and calibrating data to minimize deviations and improve accuracy.

“For instance, data cleansing involves refining massive raw datasets by correcting errors, removing duplicates, filling gaps, and ensuring consistency to improve usability.”
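
As a rough illustration of the cleansing steps Zhang describes, the short Python sketch below applies them to a handful of toy records. The record fields (`text`, `label`), the `clean_records` function, and the sample data are hypothetical, chosen only for illustration; they are not drawn from Digital Tianma's actual pipeline.

```python
def clean_records(records, default_label="unknown"):
    """Return a cleaned copy of `records`, a list of dicts with 'text' and 'label'.

    Hypothetical helper for illustration only.
    """
    cleaned = []
    seen = set()
    for record in records:
        # Correct simple errors: collapse stray whitespace in the text field.
        text = " ".join(str(record.get("text") or "").split())
        if not text:
            continue  # nothing usable left to annotate

        # Ensure consistency: store labels in one canonical (lowercase) form.
        # Fill gaps: fall back to a placeholder when the label is missing.
        label = str(record.get("label") or "").strip().lower() or default_label

        # Remove duplicates: keep only the first copy of each (text, label) pair,
        # comparing text case-insensitively.
        key = (text.lower(), label)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"text": text, "label": label})
    return cleaned


raw = [
    {"text": "  What is  compound interest? ", "label": "Finance"},
    {"text": "what is compound interest?", "label": "finance"},
    {"text": "How do index funds work?", "label": None},
    {"text": "", "label": "finance"},
]
cleaned = clean_records(raw)
for row in cleaned:
    print(row)
```

In this toy run, the second record is dropped as a duplicate of the first, the third receives the placeholder label, and the empty fourth record is discarded. Real annotation work would likely involve far richer rules and human review.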

Zhang explained that the work of data annotators, also known as AI trainers, resembles that of teachers: imparting knowledge and building reasoning capabilities. AI training, computational power, and algorithms are the three pillars determining the quality of LLMs.

Well-trained models deliver more accurate responses. For instance, if a query contains typos, models trained for user intent recognition can infer the purpose from context and respond accordingly.

With eight years in AI training, Zhang has witnessed the field transform from labor-intensive to knowledge-driven.

Before 2022, training focused on general knowledge. For example, in the autonomous driving field, annotators would label street-view images, identifying crosswalks or vehicles. Those tasks required minimal expertise but vast manpower.

“The AI landscape has been evolving rapidly since 2022, and we need subject-matter experts for enhanced LLM training. The shift is reflected in our hiring, which values multi-skilled professionals,” said Zhang.

Li Wenyuan, who joined Zhang’s team last August after nearly a decade in finance, said that deep integration of LLMs with domain-specific expertise relies on precise, specialized datasets.

Li works as a data annotator after nearly a decade in finance. (Photo/Li Wenyuan)

Li’s work mainly supports Ma Xiaocai, an online financial model developed by Ant Group. “After initial training, the model has acquired basic financial knowledge. Now, trainers need advanced expertise to refine it further. Their educational backgrounds and industry experience are critical,” Li explained.

Digital Tianma employs 5,000 annotators and has processed hundreds of millions of high-quality data entries since its establishment in 2023. As Zhang emphasizes, the AI revolution is about more than coding; trainers who blend technical skills with industry savvy are equally important participants.

The job market reflects the demand. Data from Zhaopin.com, a major recruitment platform in China, shows that demand for data annotators surged by over 50% year-on-year in February 2025.