AI Prompt Evaluation and Text Annotation Project
I worked on AI training tasks involving text annotation, prompt evaluation, and response quality assessment. The project focused on improving large language model performance: I reviewed prompts and rated generated responses on accuracy, relevance, clarity, and safety; classified text data; flagged hallucinations and inconsistencies; and compared multiple candidate responses to determine the most helpful answer. To keep labels consistent across large batches of prompts and responses, I followed structured annotation guidelines and a quality scoring framework, applied standardized labeling criteria, and provided feedback used to improve model alignment and performance.
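
To make the workflow concrete, below is a minimal sketch of the kind of annotation record such a rubric-based evaluation might produce. All names here (RUBRIC_DIMENSIONS, ResponseRating, ComparisonRecord, the 1-5 scale) are hypothetical illustrations under assumed conventions, not the project's actual schema or tooling.

```python
from dataclasses import dataclass

# Hypothetical rubric dimensions, mirroring the criteria described above.
RUBRIC_DIMENSIONS = ("accuracy", "relevance", "clarity", "safety")

@dataclass
class ResponseRating:
    """Scores for one generated response, 1-5 per rubric dimension."""
    response_id: str
    scores: dict[str, int]
    hallucination_flag: bool = False  # response asserts unsupported facts
    notes: str = ""

    def __post_init__(self) -> None:
        # Enforce the rubric: every dimension scored, every score in range.
        for dim in RUBRIC_DIMENSIONS:
            score = self.scores.get(dim)
            if score is None or not 1 <= score <= 5:
                raise ValueError(f"{dim!r} needs a score between 1 and 5")

@dataclass
class ComparisonRecord:
    """Pairwise judgment: which of two rated responses is more helpful."""
    prompt_id: str
    ratings: tuple[ResponseRating, ResponseRating]
    preferred: str  # response_id of the more helpful answer

# Example: rate two candidate responses for one prompt, flag a
# hallucination in one of them, and record the pairwise preference.
a = ResponseRating("resp-a", {"accuracy": 5, "relevance": 4, "clarity": 4, "safety": 5})
b = ResponseRating("resp-b", {"accuracy": 2, "relevance": 4, "clarity": 5, "safety": 5},
                   hallucination_flag=True, notes="cites a nonexistent study")
record = ComparisonRecord("prompt-123", (a, b), preferred="resp-a")
print(record.preferred)  # -> resp-a
```

Validating scores at record-creation time is one simple way a scoring framework can enforce consistency before labels enter the dataset.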