AI Evaluation Researcher
• Applied advanced statistical modeling techniques (PCA, clustering, ANOVA, regression) to evaluate model outputs, identifying critical failures in numerical reasoning and analytical logic.
• Conducted A/B testing on high-complexity RLHF workflows and dataset-driven prompts, using weighted evaluation rubrics (L0 metrics) to quantify and improve model reasoning capabilities.
• Documented detailed quality control analysis, including mathematical proofs and step-wise code explanations, to resolve model misinterpretations and optimize dataset quality.