LLM Training & RLHF Subject Matter Expert (Ocado Technology)
Provided high-quality human feedback for Large Language Model (LLM) training, focused on optimizing reasoning, factual accuracy, and brand-voice alignment. Led AI operations that structured instructional inputs and review data for advanced reinforcement learning systems. Used prompt engineering and supervised fine-tuning methods to directly improve model outputs for business-focused use cases.
• Designed and executed data annotation workflows for business logic alignment.
• Conducted thorough evaluation and rating of LLM-generated responses.
• Applied LLM evaluation, fact-checking, and prompt engineering for AI safety and hallucination mitigation.
• Integrated feedback cycles using internal/proprietary tooling and Python-based review scripts.