AI Training Specialist (Freelance/Independent)
As an AI Training Specialist, I collaborated on reinforcement learning from human feedback (RLHF) projects, assessing and rating model outputs to inform the reward models for large language models (LLMs). I evaluated AI-generated outputs for truthfulness, safety, and helpfulness, and developed prompts to test for and uncover model hallucinations. I fact-checked responses for technical and general correctness and provided logical justifications for my ratings.
• Evaluated LLM outputs using structured preference ranking guidelines.
• Designed creative prompts to assess model limitations and edge cases.
• Performed comprehensive hallucination detection and logical flow audits.
• Contributed to reward model optimization via rich, labeled data sets.