AI Model Trainer & Evaluator | Outlier Freelance
Contributed to training large language models (LLMs) via Reinforcement Learning from Human Feedback (RLHF), focusing on code and math evaluation tasks. Reviewed, rated, and improved AI-generated responses to Python programming and mathematics prompts, applying structured evaluation criteria and coding expertise to improve model accuracy.
• Wrote and evaluated Python code to guide the model and identify logic errors.
• Rated mathematical problem sets to strengthen the model's reasoning ability.
• Assessed search-algorithm outputs through a review process analogous to software QA.
• Gained hands-on experience with LLM training pipeline methodologies.