AI Trainer
Contributed to a suite of AI training and evaluation projects through Handshake, supporting large-scale RLHF (Reinforcement Learning from Human Feedback) and model quality improvement initiatives.

Tasks included side-by-side comparison and ranking of AI-generated responses with a STEM focus (Project Alloy); domain-specific evaluation of model responses (Projects Ohm and Watt); coding prompt creation and code response evaluation (Projects Chard and Chard 2.0); identification of complex pull requests and codebase analysis (Project Helix); and multi-modal task evaluation across audio, visual, and text inputs (Project Hedgehog). Engagements ranged from individual task-based contributions to ongoing contract work across multiple simultaneous pipelines.

Quality was maintained through strict adherence to project-specific rubrics and evaluation guidelines, regular calibration with project leads, attention to factual accuracy and reasoning quality in STEM domains, and timely delivery within task deadlines. All work was performed remotely under contract, with a consistently high standard of annotation quality supporting model fine-tuning and alignment efforts.