AI Trainer/LLM Evaluation Specialist
Contributed to reinforcement learning from human feedback (RLHF) pipelines as part of LLM evaluation cycles. Evaluated, ranked, and audited LLM outputs for quality, helpfulness, bias, and hallucinations. Performed red-teaming and process reward annotation tasks to improve model safety and robustness.
• Created and reviewed preference pairs and DPO training data
• Executed red-teaming and hallucination audits on generated text
• Applied multi-step reasoning evaluation and prompt engineering techniques
• Used Label Studio, Labelbox, and proprietary annotation pipelines