Audio Model Evaluator
Evaluated audio model responses for cultural nuance, grounding, and personalization. Designed multi-turn conversational prompts using personal context to test model adaptability in audio contexts. Provided structured rationales and feedback to refine AI model output alignment with user data.
• Conducted side-by-side ranking of model outputs to identify performance differences.
• Identified personalization errors, incorrect cultural facts, and technical inaccuracies.
• Verified data source utilization and assessed integration quality in responses.
• Evaluated proper use of personal data and identified flawed inferences or hallucinations.