Math / Python & AI Code Evaluator
Evaluated AI-generated mathematics solutions and Python code for correctness, clarity, and computational efficiency. Created reference solutions and feedback templates to improve model performance and acceptance rates. Reported regularly to the training and prompt engineering teams on error classes and improvements.
• Designed workflow documentation and KPI dashboards for scalable assessment.
• Generated annotated test cases and example solutions to support engineering reproducibility.
• Delivered consistent, rubric-driven assessments of advanced cases each month.
• Collaborated with cross-functional teams to refine evaluation approaches.