Generalist AI Evaluator
As a Generalist AI Evaluator at Outlier, I assessed AI responses across various tasks for factual accuracy, instruction-following, tone, verbosity, and safety. My evaluations required applying detailed rubrics and justifying model rankings with evidence-based reasoning. Results were used as signals for reward model training to refine AI behaviors.
• Ensured strict adherence to quality standards and guidelines.
• Reviewed diverse task types and provided objective assessments.
• Maintained low error rates and high-quality scores.
• Contributed to the continuous improvement of AI model performance.