AI Data Annotator & Evaluator
Evaluated and annotated outputs from large language models (LLMs) across multiple task types to improve AI system performance. Applied predefined guidelines to assess the accuracy, completeness, and quality of generated text responses, using both subjective and objective evaluation methods to strengthen annotation reliability.
• Conducted intent analysis, supervised fine-tuning (SFT) annotation, reward-model ranking, and response scoring.
• Performed double-blind evaluations and maintained high annotation standards.
• Provided feedback on model errors and opportunities for improvement.
• Contributed to auto-evaluation dataset creation and handled multilingual annotation tasks.