AI Data Trainer & Annotator
As an AI Data Trainer at Outlier.ai, I evaluate and improve outputs from Large Language Models (LLMs) using Reinforcement Learning from Human Feedback (RLHF). I ensure data quality through detailed assessments, model comparisons, and iterative feedback, adhering to strict guidelines for safety, neutrality, and factual accuracy. My work focuses on identifying edge cases and improving the reliability of AI-generated responses.
• Rank and compare LLM responses based on quality and instruction-following.
• Analyze and annotate complex prompt-response pairs to drive continuous improvement.
• Apply rigorous quality-control measures to detect hallucinations, bias, and inaccuracies.
• Provide detailed feedback to optimize model safety and correctness.