AI Training Engineer & RLHF Specialist
Served as an AI Training Engineer responsible for optimizing and aligning Large Language Models (LLMs) through structured evaluation. Applied reinforcement learning from human feedback (RLHF) methods to improve model accuracy, safety, and instruction alignment. Conducted comprehensive quality checks of model outputs for systemic bias, logical correctness, and adherence to guidelines.

• Evaluated model responses for hallucinations, factual inaccuracies, and instruction-following errors.
• Provided structured human feedback used directly in RLHF fine-tuning cycles.
• Collaborated with engineering teams to incorporate evaluation feedback into training pipelines.
• Consistently delivered annotation quality that exceeded team benchmarks.