GenAI Evaluation & Knowledge-Grounded Data Annotation for Professional Domains
Led and contributed to large-scale data labeling and evaluation initiatives supporting production-grade Generative AI and neuro-symbolic systems in regulated professional domains. Designed and executed high-quality annotation workflows for training, fine-tuning, and evaluating LLM-powered applications used in legal and financial knowledge work.

Key responsibilities included annotating and reviewing thousands of data samples across multiple task types:
- Prompt–response pairs for supervised fine-tuning (SFT)
- Human preference ratings for RLHF-style evaluation
- Entity and concept labeling aligned with domain ontologies
- Document-level summarization
- Question–answer validation grounded in authoritative source documents

Developed detailed annotation guidelines to ensure consistency, correctness, and regulatory alignment, with a strong emphasis on factual accuracy, reasoning quality, explainability, and bias mitigation. Performed adversarial and re
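As a minimal sketch of how annotation consistency of this kind can be quantified, the snippet below computes Cohen's kappa (chance-corrected agreement) between two annotators' preference ratings. The label set ("A", "B", "tie") and the labels themselves are illustrative assumptions, not project data.

```python
from collections import Counter

# Hypothetical preference labels assigned independently by two annotators
# to the same batch of prompt-response comparisons (illustrative only).
annotator_1 = ["A", "A", "B", "tie", "B", "A", "B", "A"]
annotator_2 = ["A", "B", "B", "tie", "B", "A", "A", "A"]

def cohens_kappa(labels_1, labels_2):
    """Chance-corrected agreement between two annotators over the same items."""
    n = len(labels_1)
    # Observed agreement: fraction of items where both annotators match.
    observed = sum(a == b for a, b in zip(labels_1, labels_2)) / n
    # Expected agreement under independence: sum over labels of the
    # product of each annotator's marginal rate for that label.
    c1, c2 = Counter(labels_1), Counter(labels_2)
    expected = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    return (observed - expected) / (1 - expected)

print(round(cohens_kappa(annotator_1, annotator_2), 3))  # -> 0.579
```

A kappa near 0 means agreement no better than chance, near 1 means near-perfect agreement; tracking it per task type is a common way to verify that annotation guidelines are being applied consistently.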