AI Response Evaluation & RLHF Data Annotation Specialist
Worked on large-scale text annotation and AI response evaluation projects supporting large language model training. Assessed model-generated outputs against structured rubrics covering accuracy, relevance, safety, reasoning quality, and policy compliance. Tasks included comparing multiple candidate responses, identifying hallucinations, flagging unsafe or misleading content, and providing structured ratings to improve model alignment. Operated within Snorkel AI's guideline-driven annotation workflows, ensuring consistency across high-volume tasks while adapting to evolving project instructions. Maintained strong quality standards through calibration exercises, feedback loops, and strict adherence to evaluation criteria, supporting reliable model fine-tuning and performance optimization.