AI Model Evaluation & Data Annotation Experience | Freelancer | Human Preference & Feedback Labeling
I participated in human preference and ranking tasks for AI-generated text outputs as part of reinforcement learning from human feedback (RLHF). This involved structured evaluation of response helpfulness, clarity, accuracy, and naturalness. I provided written justifications for each preference judgment to strengthen RLHF signals during model fine-tuning.
• Performed comparative ranking of model-generated text outputs
• Assessed outputs for accuracy, helpfulness, and fluency across subject domains
• Provided structured written justifications for choices and rankings
• Enhanced reinforcement learning processes with high-quality human feedback