AI Engineer
I worked as an RLHF (Reinforcement Learning from Human Feedback) engineer for Outlier AI's LLM, with a focus on generating, evaluating, and labeling advanced Python code for the model. Responsibilities included devising novel problems for the model to solve, labeling its responses against comprehensive and nuanced criteria, and quality-checking the labeling work performed by other engineers.