AI Safety Expert, Reviewer, and SFT Annotator
As an AI Safety Expert, Reviewer, and SFT Annotator at Mercor, I evaluated and annotated training data for reinforcement learning from human feedback (RLHF). My responsibilities included reviewing AI-generated outputs and providing feedback to ensure alignment, safety, and high-quality model behavior. The work focused on performance evaluation and accurate annotation to support fine-tuning of large language models.
• Conducted performance reviews of AI outputs for safety and quality.
• Evaluated and annotated text data for RLHF training.
• Focused on model alignment and bias mitigation.
• Used internal and proprietary tooling for annotation and review tasks.