AI Data Trainer (Contract)
As an AI Data Trainer, I helped improve the reasoning capabilities of large language models (LLMs) through Reinforcement Learning from Human Feedback (RLHF) and Supervised Fine-Tuning (SFT). I was responsible for evaluating complex code generations for technical accuracy and for creating training datasets. The role required in-depth analysis of multi-turn logic and a focus on minimizing model hallucinations.
• Reviewed and rated model outputs in Python and Java.
• Created labeled ground-truth datasets for assessing code generation.
• Applied RLHF/SFT best practices to support AI safety and reliability.
• Targeted multi-turn interactions and edge-case scenarios for robust LLM evaluation.