LLM Response Generation and Quality Review
Evaluated and refined AI-generated responses to improve clarity, accuracy, and instruction adherence. Reviewed model outputs against structured guidelines, labeled them against predefined quality metrics, categorized them by correctness level, and flagged factual errors and inconsistencies. Provided structured feedback to support model fine-tuning and benchmarking. Maintained high consistency across large task volumes while following evolving annotation guidelines, improving dataset reliability and overall model evaluation quality.