Oracle-Tier AI Trainer & Multidisciplinary Subject Matter Expert
CORE DOMAINS & RESPONSIBILITIES:

🚀 1. Advanced RLHF & SFT (Model Training):
Executed complex Reinforcement Learning from Human Feedback (RLHF) and Supervised Fine-Tuning (SFT) workflows.
Authored detailed SFT Justifications explaining why a specific response is superior, focusing on logic, syntax, and safety.
Conducted Side-by-Side (SbS) evaluations in strict adherence to complex grading rubrics.
Performed in-depth Root Cause Analysis (RCA) to identify model hallucinations and logic failures.

🧮 2. Math Reasoning & Logic (STEM):
Generated and evaluated Chain-of-Thought (CoT) prompts for complex mathematical problems (Calculus, Linear Algebra).
Verified the factual accuracy and logical consistency of model outputs through rigorous proof-checking, with solutions formatted in LaTeX.
Designed "Golden Data" (Ground Truth) sets used to benchmark model performance and train other annotators.