AI Trainer | Prompt Engineering, RLHF, Model Evaluation (Outlier.ai)
Evaluated and refined LLM outputs using reinforcement learning from human feedback (RLHF) to improve model accuracy and strengthen guardrails. Generated high-quality, domain-specific training datasets for foundation models and designed prompt strategies for comprehensive model evaluation.
• Improved model reasoning and safety through iterative RLHF cycles.
• Built targeted edge-case scenarios to surface model hallucinations.
• Enhanced contextual awareness by creating diverse training data.
• Used evaluation results to guide iterative training and prompt optimization.