Mathematics Problem Design & AI Data Labeling for LLM Training
Developed and labeled large-scale datasets for training and evaluating large language models on advanced mathematics. Wrote more than 400 original problems in algebra, probability, and number theory, each with a step-by-step solution, and created evaluation rubrics for scoring model performance. Labeled model outputs for correctness, logical consistency, and clarity, verifying computations with computer algebra systems (SageMath, SymPy). Collaborated with AI researchers to align the datasets with project objectives, improving the models' mathematical reasoning accuracy by 30%. Maintained dataset quality through systematic review cycles and statistical validation.
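
Illustrative example: a minimal sketch of the kind of automated correctness check used when labeling a model's final answer. It is not the project's actual tooling. It uses only Python's standard library (exact rational arithmetic and exhaustive enumeration) as a stand-in for a SageMath or SymPy check, and the function names and the sample dice problem are hypothetical.

```python
from fractions import Fraction
from itertools import product


def exact_probability(event, sides=6, dice=2):
    """Exact probability of `event` over all outcomes of rolling `dice` fair dice."""
    outcomes = list(product(range(1, sides + 1), repeat=dice))
    favorable = sum(1 for roll in outcomes if event(roll))
    return Fraction(favorable, len(outcomes))


def label_answer(claimed, event):
    """Label a claimed answer (a string such as '1/6') as 'correct' or 'incorrect'."""
    try:
        claimed_value = Fraction(claimed)
    except (ValueError, ZeroDivisionError):
        # Unparseable or ill-formed answers are labeled incorrect.
        return "incorrect"
    return "correct" if claimed_value == exact_probability(event) else "incorrect"


# Hypothetical problem: probability that two fair dice sum to 7.
def sum_is_seven(roll):
    return sum(roll) == 7


print(label_answer("1/6", sum_is_seven))   # correct
print(label_answer("6/36", sum_is_seven))  # correct (unreduced but equal)
print(label_answer("1/12", sum_is_seven))  # incorrect
```

Comparing exact fractions rather than floating-point values avoids false mismatches between equivalent forms such as 6/36 and 1/6.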