LLM Evaluation and Text Annotation for Chatbot Development
Led a team annotating and evaluating text datasets for the development of a conversational AI chatbot. The project covered over 50,000 text samples and was completed within six months.

- Data Labeling: Annotated text data for named entity recognition (NER), sentiment analysis, and intent classification.
- LLM Evaluation: Evaluated the performance of large language models (e.g., GPT-3, T5) on text summarization, question answering, and dialogue generation.
- Text Generation: Curated and generated high-quality training data to improve chatbot responses.
- Quality Assurance: Implemented rigorous quality control measures, including inter-annotator agreement checks, to ensure 95%+ accuracy in labeled data.
- Fine-tuning: Collaborated with AI engineers to fine-tune LLMs on the annotated data, yielding a 20% improvement in chatbot response quality.
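The inter-annotator agreement checks above can be sketched with Cohen's kappa, a standard agreement statistic for two annotators labeling the same samples. This is a minimal, self-contained illustration; the intent labels and data are hypothetical, not from the actual project.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same samples."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of samples both annotators labeled identically.
    po = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    labels = set(labels_a) | set(labels_b)
    pe = sum((freq_a[lab] / n) * (freq_b[lab] / n) for lab in labels)
    return (po - pe) / (1 - pe)

# Illustrative intent labels from two annotators:
ann1 = ["greeting", "order", "order", "complaint", "greeting", "order"]
ann2 = ["greeting", "order", "complaint", "complaint", "greeting", "order"]
print(round(cohen_kappa(ann1, ann2), 3))  # 0.75
```

Kappa corrects raw agreement for chance; teams typically set a threshold (e.g., kappa above 0.8) before accepting a batch of labels as meeting an accuracy target.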