RLHF Evaluator (Freelance)
As part of reinforcement learning from human feedback (RLHF) workflows, I evaluated AI-generated text for factual accuracy, relevance, and instruction adherence. I contributed to prompt evaluation, quality assessment, and improvement of language model outputs, producing structured reviews and reports on large-scale text data for AI research.
• Judged language model outputs for correctness, relevance, and adherence to instructions.
• Improved AI responses through continuous, structured feedback.
• Developed insights into effective prompt engineering.
• Maintained dataset integrity for model fine-tuning.