Text Categorization and Evaluation for LLM Training
In this project, I categorized and evaluated large text datasets used to train language models. My tasks included classifying text into predefined categories, evaluating the relevance and accuracy of AI-generated responses, and checking that outputs met the project’s quality standards. I worked with diverse content in English and Spanish, combining manual annotation with quality checks on the training data. The project aimed to improve the language model’s performance in real-world applications such as customer service and content generation.