AI Technical Trainer / SME (Subject Matter Expert) | outlier.ai
As an AI Technical Trainer and SME at Outlier.ai, I carried out Reinforcement Learning from Human Feedback (RLHF) tasks to improve large language model accuracy in complex domains. I evaluated AI-generated text for factual accuracy, logical consistency, and adherence to safety guidelines, and I authored high-quality prompts and responses to train models in multi-turn reasoning and problem-solving.
• Applied RLHF to engineering, math, and physics texts
• Focused on edge-case identification, hallucination detection, and model ranking
• Drafted “gold standard” ground-truth data for LLM training
• Ensured prompts and responses aligned with safety and quality guidelines