LLM Data Annotation & Evaluation
Worked on large-scale LLM training and evaluation projects for Turing. Annotated and evaluated AI-generated text, ranked candidate responses, wrote high-quality prompts and reference answers, and performed RLHF tasks to improve model alignment. Assessed outputs against detailed rubrics for factual accuracy, reasoning quality, safety, and instruction adherence. Participated in red teaming and edge-case analysis to surface model weaknesses. Maintained high accuracy and consistency across diverse task types while meeting strict quality and guideline requirements.