Multimodal AI Training & Annotation (Audio, Image, Video, Text)
Currently contributing to multimodal AI training projects involving annotation and evaluation of audio, image, video, and text datasets.
- Speech transcription and quality review; image and video object detection and segmentation; multimodal content classification; evaluation of AI-generated outputs across modalities.
- Prompt–response authoring and rating for generative and conversational AI systems (RLHF/SFT workflows).
- Domains span everyday scenes, human activities, speech samples, and multimodal reasoning tasks.
- Maintain high annotation accuracy through guideline adherence, calibration rounds, and peer review, supporting robust model training and evaluation.
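Calibration rounds of the kind described above are often scored with an inter-annotator agreement metric such as Cohen's kappa, which corrects raw agreement for chance. A minimal sketch (the label set and sample annotations below are illustrative, not from any specific project):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two annotators on the same items, corrected for chance."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical calibration round: two annotators label the same 8 audio clips.
a = ["speech", "music", "speech", "noise", "speech", "music", "noise", "speech"]
b = ["speech", "music", "speech", "speech", "speech", "music", "noise", "noise"]
print(round(cohens_kappa(a, b), 3))  # kappa of 0.6: moderate agreement
```

Teams commonly set a kappa threshold that annotators must reach during calibration before their labels feed into training data.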