GPT-4 Conversational AI Evaluation & Text Annotation
Developed and evaluated a GPT-4-powered chatbot used by 1,000+ users. Labeled and annotated text datasets for semantic search and conversational fine-tuning: classified entities, rated generated responses, and wrote prompt–response pairs for supervised fine-tuning (SFT). Built evaluation frameworks to measure response accuracy, coherence, and user satisfaction. The project covered thousands of conversation samples, with peer review and test metrics used as quality assurance to keep annotation consistency above 90%.
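
A minimal sketch of one way an annotation-consistency figure like the one above could be checked, assuming two annotators labeled the same items. The function names, the sample labels, and the choice of raw percent agreement plus Cohen's kappa are illustrative assumptions; the project description does not state which metric was actually used.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label
    # if each labeled at random according to their own label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical entity labels from two annotators on the same ten samples.
annotator_a = ["PERSON", "ORG", "ORG", "DATE", "PERSON",
               "ORG", "DATE", "PERSON", "ORG", "PERSON"]
annotator_b = ["PERSON", "ORG", "ORG", "DATE", "PERSON",
               "ORG", "DATE", "PERSON", "ORG", "ORG"]

agreement = percent_agreement(annotator_a, annotator_b)
kappa = cohens_kappa(annotator_a, annotator_b)
print(f"percent agreement: {agreement:.2f}")
print(f"Cohen's kappa:     {kappa:.2f}")
```

Percent agreement is the simplest reading of "consistency"; kappa is stricter because it discounts matches that would occur by chance when one label dominates the dataset.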