Software Engineer for Training AI Data (RLHF Annotator)
As a Software Engineer for Training AI Data at Outlier, I worked on RLHF (Reinforcement Learning from Human Feedback) projects covering data science, Python programming, and Spanish-language tasks. My work involved providing human feedback and evaluating model responses, supporting the fine-tuning of large language models through RLHF. I also contributed to generalist RLHF projects and checked that AI responses met the expected quality standards.
• Evaluated and rated AI model outputs in Spanish and English.
• Provided feedback on Python programming-focused RLHF tasks.
• Participated in human-in-the-loop AI training for data science projects.
• Collaborated remotely and maintained high annotation consistency.