Mathematics Expertise Sought for AI Training
In my role at Outlier.ai (formerly Remotasks), I worked on an AI training project that used Reinforcement Learning from Human Feedback (RLHF) to improve language models in mathematics tutoring and complex reasoning. My tasks included writing prompts and responses, evaluating model outputs, and upholding high-quality data labeling standards. In practice, this meant creating challenging, university-level math and logic prompts and reviewing model responses for accuracy and coherence. The experience was rewarding, but regional restrictions limited my participation.