Human-in-the-Loop Evaluator (Freelance)
As a Human-in-the-Loop Evaluator, I contributed to the fine-tuning and safety auditing of AI models on the Feather/OpenAI platform. I evaluated model completions for instruction following, clarity, and technical accuracy, with a strong focus on high-stakes domains such as food safety and health. I consistently maintained a high quality rating through logical analysis and rigorous attention to detail in every task.
• Participated directly in RLHF evaluations and side-by-side Arena ranking tasks.
• Identified model hallucinations and audited responses for safety and adherence to USDA/FDA standards.
• Provided in-depth rationales when ranking model outputs for accuracy and helpfulness.
• Fact-checked and graded model completions for tone, clarity, and adherence to technical safety protocols.