Google I/O 2023 has come and gone, but the chatter about generative AI is far from over. AI is sweeping across every corner of our lives, upending human work as it goes. What was once a futuristic fantasy has turned into an unsettling reality.
It's true: AI keeps getting better, and it's pushing more and more industries toward full automation, with minimal human involvement. The same is happening in data annotation.
As the CEO of a data annotation company, with a dedicated team of hundreds of human experts (yes, we still keep it manual), I’m certain that AI isn’t a threat to the manual work we’ve been doing for years. Yet, with the ever-growing array of AI tools for automated data labeling, perhaps there is reason for concern.
You knew ChatGPT would be the starting point, right? It quickly became the talk of the town in the global tech community and beyond. And today, the debate is open on how well it can actually handle data labeling tasks.
Can ChatGPT ever replace a data annotator? Research from the University of Zurich, published on arXiv, compared the AI language model's performance with that of crowd-workers on text annotation tasks involving Twitter data. The results were truly captivating: ChatGPT outperformed crowd-workers in four out of five tasks, and its intercoder agreement surpassed both crowd-workers and trained annotators across all tasks.
What's more, ChatGPT achieved this at a per-annotation cost of less than $0.003, roughly twenty times cheaper than Amazon's crowdsourcing marketplace, MTurk. These findings show the immense potential of large language models (LLMs) for making text classification far more efficient. However, I should note that both ChatGPT and the MTurk crowd-workers were measured against trained human annotators, who served as the gold standard in this research.
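To make this concrete, here is a minimal sketch of what zero-shot annotation with an LLM can look like. It uses the openai Python package (pre-1.0 interface); the model name, label set, and prompt are my own illustration, not the setup from the study:

```python
# Sketch: zero-shot text annotation with an LLM (pre-1.0 openai package).
# Assumes an API key in the OPENAI_API_KEY environment variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

LABELS = ["relevant", "irrelevant"]  # hypothetical task: topic relevance

def annotate(tweet: str) -> str:
    """Ask the model to pick exactly one label for a tweet."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,  # deterministic output helps annotation consistency
        messages=[
            {"role": "system",
             "content": "You are a text annotator. Reply with exactly one label from: "
                        + ", ".join(LABELS) + "."},
            {"role": "user", "content": tweet},
        ],
    )
    return response["choices"][0]["message"]["content"].strip().lower()

print(annotate("Content moderation rules on social platforms keep changing."))
```

Setting the temperature to zero is deliberate: consistent, reproducible labels are precisely what intercoder agreement measures.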
Nevertheless, can we place our trust solely in automated data annotation by ChatGPT and similar AI tools, completely replacing manual annotation? Let me explain why I don't see that happening any time soon, drawing on more than a decade of experience in the field.
It's still too early to say that AI will completely replace data annotators. Why? Well, human annotators bring deep technical expertise and skills. Plus, they understand data on a whole different level and can interpret it in a way that accurately reflects the real world.
AI models designed for automated data labeling, on the other hand, are far from perfect. We've all seen this when using chatbots for work or personal tasks. They still have a long way to go before catching up with the human touch. Consider OCR (Optical Character Recognition): AI annotation tools can automatically label documents, select text within bounding boxes, and transcribe it.
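If you're curious what that automatic first pass looks like, here is a small sketch built on pytesseract, the open-source Python wrapper around the Tesseract OCR engine. It assumes Tesseract is installed locally, and the file name is a placeholder:

```python
# Sketch: automatic OCR pre-labeling with Tesseract via pytesseract.
import pytesseract
from PIL import Image
from pytesseract import Output

image = Image.open("invoice_scan.png")  # hypothetical document scan
data = pytesseract.image_to_data(image, output_type=Output.DICT)

# Each detected word comes with a bounding box and a confidence score.
for i, word in enumerate(data["text"]):
    if word.strip() and float(data["conf"][i]) > 0:
        box = (data["left"][i], data["top"][i], data["width"][i], data["height"][i])
        print(f"{word!r} at (x, y, w, h) = {box}, confidence {data['conf'][i]}")
```

Output like this is exactly the kind of pre-labeling our team then verifies and corrects by hand.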
At Label Your Data, a spin-off of SupportYourApp, document processing involves large teams working as humans in the loop. We receive pre-labeled data, and our skilled team applies specific labels to different elements, such as merchant names or invoice IDs.
AI is beneficial for pre-annotation because it can quickly analyze data and make initial labeling suggestions, giving the annotation process a head start and saving human annotators time. However, it cannot replace the extensive work my annotation team carries out, because AI has only been tested on a narrow range of data labeling tasks.
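In practice, that head start often boils down to simple confidence-based routing. Here is a hypothetical sketch; the model interface and the 0.9 threshold are my own assumptions for illustration:

```python
# Sketch: using model confidence to decide how much human work an item needs.
from typing import Callable, List, Tuple

REVIEW_THRESHOLD = 0.9  # illustrative cut-off, not a recommendation

def route(items: List[str], predict: Callable[[str], Tuple[str, float]]):
    """Split items into drafts a human only confirms vs. items labeled from scratch."""
    quick_verify, full_manual = [], []
    for item in items:
        label, confidence = predict(item)  # hypothetical model interface
        if confidence >= REVIEW_THRESHOLD:
            quick_verify.append((item, label))  # annotator just confirms the draft
        else:
            full_manual.append(item)            # annotator labels manually
    return quick_verify, full_manual

# Toy stand-in model: "confident" only when the text mentions an invoice.
demo = lambda t: ("invoice", 0.95) if "invoice" in t else ("unknown", 0.4)
print(route(["invoice #42 from ACME", "blurry receipt photo"], demo))
```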
Take ChatGPT, for instance. It can handle only a few specific annotation tasks: as a language model, it is limited to NLP (Natural Language Processing) tasks and, accordingly, to text data. More tellingly, when it comes to tasks like content moderation, ChatGPT cannot provide accurate labels because it lacks emotional intelligence, a quality AI systems don't have today and may never acquire.
This holds true for numerous other tasks that involve subjective judgment. So, that’s one more objection to ChatGPT and other AI tools potentially replacing manual annotation.
Now, let's discuss automated data annotation for visual data like images and videos, focusing on an AI tool called SAM (Segment Anything Model), developed by Meta. SAM was built using a dataset of over 11 million images and is designed for image segmentation. It can predict masks based on different input types such as bounding boxes, points, and text.
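For those who haven't tried it, here is roughly what prompting SAM with a single point looks like via Meta's open-source segment-anything package. The image path and click coordinates are placeholders; the checkpoint file is the published ViT-H weights:

```python
# Sketch: prompting Meta's Segment Anything Model (SAM) with one foreground click.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("street_scene.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # compute the image embedding once per image

# A single foreground point (label 1) is enough to get candidate masks.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[450, 300]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return several masks with predicted quality scores
)
print(f"Best predicted mask quality: {scores.max():.3f}")
```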
While my team handles this task manually, the model takes care of it automatically. Does this imply that manual data annotation is now obsolete? I can assure you it is not, and let me explain why. AI models like SAM excel only on high-quality images, whereas our annotators can label even the fuzziest photographs and video data. That's a skill AI still has to work on.
AI is increasingly used in programmatic QA (Quality Assurance) to automate and enhance various parts of the testing process. It can generate test cases, execute tests, detect defects, and analyze anomalies. This lets QA teams focus on critical tasks and improve overall annotation quality.
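As a simple illustration, a programmatic QA pass over bounding-box annotations might flag obvious defects before a human reviewer ever opens the file. The annotation schema below is hypothetical; real pipelines layer on many more checks:

```python
# Sketch: a basic programmatic QA pass over bounding-box annotations.
from typing import Dict, List

def qa_check(annotations: List[Dict], img_w: int, img_h: int) -> List[str]:
    """Flag common labeling defects so human reviewers can focus on real problems."""
    issues = []
    for ann in annotations:
        x, y, w, h = ann["x"], ann["y"], ann["width"], ann["height"]
        if w <= 0 or h <= 0:
            issues.append(f"{ann['id']}: degenerate box ({w}x{h})")
        if x < 0 or y < 0 or x + w > img_w or y + h > img_h:
            issues.append(f"{ann['id']}: box extends outside the image")
        if not ann.get("label"):
            issues.append(f"{ann['id']}: missing class label")
    return issues

sample = [{"id": "a1", "x": -5, "y": 10, "width": 40, "height": 0, "label": ""}]
print(qa_check(sample, img_w=640, img_h=480))
```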
However, AI labeling and annotation tools such as ChatGPT or SAM can mishandle personal data in a few ways. Improper tool configuration or supervision may lead to unintended exposure or leakage of personal information during annotation. If annotated data is retained or stored without adequate security measures, unauthorized access and breaches can occur. Even aggregated or anonymized data used for model training can pose privacy risks if re-identification is possible.
To address these concerns, robust data protection measures, such as anonymization, encryption, access controls, and audits, are essential for the responsible handling of personal data.
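Anonymization, for instance, can start with something as simple as redacting obvious PII before the data ever reaches an annotation tool. The patterns below are deliberately simplistic and purely illustrative; a production pipeline would use a dedicated PII-detection service:

```python
# Sketch: regex-based redaction of obvious PII before data leaves a secure environment.
import re

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}"),  # US-style numbers only
}

def redact(text: str) -> str:
    """Replace matched PII with a typed placeholder so annotators never see it."""
    for tag, pattern in PATTERNS.items():
        text = pattern.sub(f"[{tag}]", text)
    return text

print(redact("Contact jane.doe@example.com or (555) 123-4567 about invoice #88."))
```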
AI is stealing the show in the data annotation realm, but I bet you'll agree with me when I say that AI is powerless without humans. My point stands amidst the hype surrounding generative AI. For example, SAM performs poorly on medical data annotation, which means the task still requires a healthcare professional with a medical background. Hence, we at Label Your Data always make sure to engage subject-matter experts for such annotation tasks.
Furthermore, the current research on automated data labeling is limited in scope: it focuses on tweet classification or high-quality image segmentation, which covers only a small slice of real-world annotation work. And it's important to note that ChatGPT is still trained and supervised by human annotators to ensure its results are sensible and non-toxic.
Ironically, even in the pursuit of fully automated labeling, AI tools still rely on human data annotation for their training and testing. Computers neither understand our evolving standards for appropriate and respectful speech nor match human vision. In other words, ChatGPT and similar tools still depend heavily on human experts.
But let's not make sweeping claims. In my view, a hybrid data annotation model, human expertise complemented by tools like ChatGPT, can be a game-changer and the ultimate way to get the best of both worlds in the age of AI. That's because humans in the loop remain crucial for validating and verifying these models. Their expertise is essential for producing top-notch data annotation and ensuring the reliability of AI.
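One concrete validation step in such a hybrid pipeline is measuring how well model labels agree with human gold labels before trusting any automation at scale. Here is a toy example with made-up labels, using scikit-learn's implementation of Cohen's kappa:

```python
# Sketch: quantifying model-human agreement before scaling up automation.
from sklearn.metrics import cohen_kappa_score

human_labels = ["spam", "ok", "ok", "spam", "ok", "spam"]    # gold standard
model_labels = ["spam", "ok", "spam", "spam", "ok", "spam"]  # model output

kappa = cohen_kappa_score(human_labels, model_labels)
print(f"Model-human agreement (Cohen's kappa): {kappa:.2f}")
# A team might let the model pre-label at scale only once kappa clears an
# agreed threshold, and keep humans reviewing everything below it.
```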
With AI in the spotlight, it is evident that automated data annotation tools such as ChatGPT have a role to play, but they are far from replacing human annotators. Technical expertise, contextual understanding, and the ability to handle diverse, poor-quality datasets are skills human annotators possess, and those skills remain invaluable.
AI tools have their limitations, particularly in tasks requiring subjective judgment and emotional intelligence. However, by complementing automated tools with human expertise, we can achieve more efficient and accurate annotations. The involvement of human annotators in validating and verifying the output of AI models ensures reliability and maintains the highest standards of data annotation.
In this age of AI, humans-in-the-loop remain crucial for bridging the gap between automation and the complexities of real-world data annotation.