Senior AI Trainer & AI Quality Lead
Led structured large language model (LLM) evaluation initiatives focused on improving response accuracy and refining reward modeling. Designed and implemented model performance tracking for annotation consistency and medical safety alignment. Collaborated with ML engineers to improve RLHF scoring criteria and evaluation rubrics.
• Led adversarial testing and safety audits to reduce high-risk outputs
• Built KPI dashboards for annotation throughput and edge-case failures
• Improved structured feedback loops for model optimization
• Increased model response accuracy by 22%