SFT & RLHF Prompt Engineer (Google DeepMind)
Created more than 700 gold-standard Python and STEM triads that support chain-of-thought reasoning in reinforcement learning from human feedback (RLHF) training of advanced AI models. Evaluated and rated over 2,000 multi-turn prompt/response conversations following established standard operating procedures (SOPs) within human-in-the-loop workflows. Maintained exceptionally high quality-control accuracy while progressing to a Reviewer role responsible for final dataset validation.
• Authored datasets targeting improvements in model alignment and consistency.
• Focused on AI evaluation for robotics and technical domains.
• Applied coding-triad review and prompt engineering expertise.
• Contributed directly to AI chatbot and LLM training cycles.