AI Evaluation Specialist (Contract)
As an AI Evaluation Specialist at Mercor Platform / AI Lab Project, I contributed to reinforcement learning from human feedback (RLHF) by ranking model-generated outputs weekly for accuracy and multi-step reasoning. I wrote high-quality prompt-response pairs for supervised fine-tuning (SFT) and documented edge cases to improve evaluation consistency. This work fed directly into reward model training and improved overall data quality.
• Ranked over 500 AI-generated outputs per week for accuracy and safety.
• Authored 100+ gold-standard prompt-response pairs for technical and conversational SFT tasks.
• Identified and reported subtle edge cases to improve evaluation clarity.
• Ensured all work adhered to strict quality assurance protocols for AI evaluation.