Outlier.AI Tier 3 Software Engineer for AI Training
I worked as a Tier 3 (Expert) Software Engineer contributor, training LLMs on Python, R, and STEM subjects. The work covered both RLHF (Reinforcement Learning from Human Feedback) and SFT (Supervised Fine-Tuning). I evaluated AI-generated code for syntax errors, logic flaws, and inefficiencies, and rewrote responses to meet strict "Gold Standard" quality guidelines. I also wrote original, complex prompts that required multi-step reasoning to stress-test the models. Throughout, I maintained a consistently high quality score by ensuring that code was executable and that responses were factually accurate.