Multimodal AI Evaluation & Data Annotation – Outlier
As a data annotator and AI evaluator on Outlier, I work on multimodal tasks spanning text, images, audio/voice, and video. I assess AI-generated responses for correctness, coherence, safety, and adherence to detailed written guidelines, providing structured ratings and written feedback used for model training and evaluation. My responsibilities include ranking and scoring multiple AI responses, identifying factual errors and policy violations, judging tone and helpfulness in conversations, and occasionally writing improved reference answers. I regularly handle edge cases, apply nuanced instructions consistently, and maintain high annotation quality over long sessions. This work has strengthened my ability to interpret complex instructions, think critically about model behavior, and contribute directly to improving AI systems.