Multilingual Text Annotation for Conversational AI (Remotasks, 2023-2024)
The Multilingual Text Annotation for Conversational AI project aimed to improve a chatbot system's performance by supplying high-quality multilingual training data in Spanish and English. The scope covered text annotation for NLP tasks such as named entity recognition (NER), intent classification, and sentiment analysis. I annotated over 10,000 words of text, tagging entities such as names, locations, and organizations and labeling user intents (e.g., complaints, questions, bookings) to improve the chatbot's contextual understanding. The work involved precise markup of conversational dialogues while preserving cultural and linguistic integrity, particularly in the Spanish-language corpora. Quality control included strict adherence to the annotation guidelines, maintaining an inter-annotator agreement (IAA) score of 90% or above, and regular randomized sampling checks. I worked with a group of annotators to ensure uniformity and participated in bi-weekly calibration se