RLHF Trainer and Rater (Outlier.ai, Soulhq.ai, Appen)
Participated in RLHF (Reinforcement Learning from Human Feedback) projects to improve large language model outputs on platforms including Outlier.ai and Soulhq.ai. Provided evaluative feedback and ratings on machine-generated text to improve model performance and align outputs with human intent. Completed annotation tasks requiring nuanced judgment of context, accuracy, and sentiment in English-language data.
• Evaluated and rated model responses to improve output relevance and alignment
• Carried out RLHF rating workflows on Outlier.ai and Soulhq.ai
• Specialized in English text generation and rating tasks
• Applied close reading and comprehension skills to assess model outputs