Multilingual LLM & Audio Annotation Project for NLP Model Training
Worked on a multilingual AI training project involving text, audio, and video data annotation on platforms such as Appen and OneForma. Responsibilities included Named Entity Recognition (NER), sentiment classification, prompt-response generation, and QA evaluation of LLM outputs. Also manually transcribed noisy audio files and translated conversational data between Hindi and English. Evaluated LLM-generated responses for coherence, factuality, and relevance against detailed guidelines. Contributed to fine-tuning AI assistant behavior through prompt engineering and writing supervised fine-tuning (SFT) data. Maintained high annotation accuracy through rigorous QA checks and client feedback loops. The project covered more than 10,000 data points, with high client satisfaction and benchmark quality scores consistently above 95%.