Multimodal AI Training & High-Precision Data Annotation for LLM and Computer Vision
Led and executed large-scale multimodal data labeling initiatives supporting the training and fine-tuning of advanced AI systems and large language models. The project involved end-to-end annotation across text, image, and audio datasets to improve model accuracy, contextual understanding, and safety alignment.

Scope of Work:
- Annotated and validated 500,000+ data points across multimodal datasets.
- Performed high-precision image labeling, including bounding boxes, polygon segmentation, object detection, and visual classification.
- Executed audio annotation workflows such as speech-to-text transcription, speaker diarization, emotion tagging, and acoustic event detection.
- Conducted NER tagging and text classification for NLP pipelines supporting conversational AI systems.
- Delivered RLHF and evaluation/rating tasks to improve LLM response quality, factuality, and safety compliance.
- Produced prompt-response pairs (SFT) to strengthen model instruction-following behavior.
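To make the image-labeling work concrete, here is a minimal sketch of a COCO-style bounding-box and polygon annotation record. The field names, IDs, and coordinates are illustrative assumptions, not the project's actual schema.

```python
import json

# Illustrative COCO-style annotation record (assumed schema, not the
# project's real one). bbox is [x, y, width, height] in pixels;
# segmentation is a flat list of polygon vertex coordinates.
annotation = {
    "image_id": 1042,
    "category_id": 3,  # hypothetical category, e.g. "vehicle"
    "bbox": [120.0, 45.0, 200.0, 150.0],
    "segmentation": [[120, 45, 320, 45, 320, 195, 120, 195]],
    "iscrowd": 0,
}

# Derive the box area from width * height as a quick consistency check.
annotation["area"] = annotation["bbox"][2] * annotation["bbox"][3]

print(json.dumps(annotation))
```

Records like this serialize cleanly to JSON, which is why COCO-style formats are a common interchange choice for detection and segmentation labels.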
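The NER tagging work described above typically uses BIO-style token labels. The sketch below shows how entity spans are recovered from such tags; the tag set (PER, ORG) and the example sentence are illustrative assumptions.

```python
# Minimal BIO-format NER sketch (assumed tag set and example text).
tokens = ["Ada", "Lovelace", "joined", "Analytical", "Engines", "Ltd"]
tags   = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "I-ORG"]

# Walk the token/tag pairs, opening a new entity on B- and extending
# the current one on a matching I- tag.
entities, current = [], None
for tok, tag in zip(tokens, tags):
    if tag.startswith("B-"):
        current = [tag[2:], [tok]]
        entities.append(current)
    elif tag.startswith("I-") and current and current[0] == tag[2:]:
        current[1].append(tok)
    else:
        current = None

spans = [(label, " ".join(words)) for label, words in entities]
print(spans)  # → [('PER', 'Ada Lovelace'), ('ORG', 'Analytical Engines Ltd')]
```

Validating that I- tags only continue a matching B- entity is a common quality check in annotation pipelines of this kind.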
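The SFT and RLHF deliverables mentioned above are usually stored as JSONL records. The sketch below shows one prompt-response (SFT) record and one preference-pair (RLHF) record; the keys ("prompt", "response", "chosen", "rejected") follow a common convention but are assumptions, not the project's actual format.

```python
import json

# Hypothetical SFT record: one instruction-following demonstration.
sft_record = {
    "prompt": "Summarize the water cycle in one sentence.",
    "response": "Water evaporates, condenses into clouds, and returns as precipitation.",
}

# Hypothetical RLHF preference record: a rated pair where "chosen" is
# the response the annotator preferred over "rejected".
rlhf_record = {
    "prompt": "Explain photosynthesis to a child.",
    "chosen": "Plants use sunlight to turn air and water into food they can grow with.",
    "rejected": "Photosynthesis is the conversion of photonic energy in chloroplast thylakoids.",
}

# Each record becomes one line of a JSONL training file.
jsonl = "\n".join(json.dumps(r) for r in (sft_record, rlhf_record))
print(jsonl)
```

One record per line keeps large training files streamable, which matters at the 500,000+ data-point scale described above.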