AI Training & Evaluation Specialist
Evaluated AI-generated responses for accuracy, safety, and helpfulness as part of ongoing improvements to Large Language Models (LLMs). Focused on high-quality data annotation and linguistic evaluation to support robust AI training. Performed expert validation on complex datasets and identified inconsistencies impacting AI output.
• Developed guidelines to ensure annotation consistency.
• Designed evaluation protocols for assessing model performance.
• Collaborated with machine learning engineers to refine LLMs.
• Conducted fact-checking and error analysis on model outputs.