Design and Implementation of Data Annotation Workflows for LLM and RAG Output Evaluation
I designed and implemented human-in-the-loop data annotation workflows to create evaluation datasets for LLM and RAG outputs. Using proprietary annotation tooling, I built processes to assess the retrieval accuracy and response relevance of generative models applied to enterprise document intelligence. These workflows included systematic evaluation and error analysis to reduce hallucinations in contract-analysis AI systems.

• Developed and maintained evaluation datasets for legal and procurement documents
• Collaborated with AI teams to capture model performance metrics and user feedback
• Automated portions of the annotation workflow for efficiency and scalability
• Ensured compliance with data privacy and security standards throughout the annotation process
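As a minimal illustration of the kind of metrics such an evaluation workflow produces, the sketch below computes retrieval accuracy from annotator judgments and Cohen's kappa for inter-annotator agreement. The function names, record schema, and labels are hypothetical, not taken from the actual tooling described above.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Inter-annotator agreement (Cohen's kappa) between two annotators.

    Compares observed agreement against the agreement expected by chance
    given each annotator's label distribution.
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[l] * counts_b.get(l, 0) for l in counts_a) / (n * n)
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

def retrieval_accuracy(records):
    """Fraction of annotated examples whose retrieved passage was judged relevant."""
    return sum(r["retrieval_relevant"] for r in records) / len(records)

# Hypothetical annotation records for RAG outputs (schema is illustrative).
records = [
    {"retrieval_relevant": True,  "response_score": 4},
    {"retrieval_relevant": True,  "response_score": 5},
    {"retrieval_relevant": False, "response_score": 2},
]

print(retrieval_accuracy(records))                       # 2 of 3 judged relevant
print(cohens_kappa(["rel", "rel", "irr", "rel"],
                   ["rel", "irr", "irr", "rel"]))        # agreement beyond chance
```

In practice, agreement scores like kappa help decide when annotation guidelines need revision, while per-example labels feed the error analysis that targets hallucinations.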