AI Trainer, Outlier
As an AI Trainer at Outlier, I evaluated and ranked LLM outputs for quality and adherence to guidelines, authored prompts and benchmark answers to generate supervised fine-tuning data across multiple domains, and conducted dataset quality assurance and error analysis to improve annotation consistency.
• Assessed LLM outputs for helpfulness, correctness, safety, clarity, and instruction compliance.
• Authored prompts and benchmark answers spanning general knowledge, reasoning, business writing, and STEM.
• Removed ambiguous prompts, standardized response formats, and enforced rigorous annotation guidelines.
• Identified and reported recurring model issues, such as hallucinations and policy violations, to inform model improvement.