AI Specialist - Remote (Data Labeling, RLHF, Red Teaming, Evaluation)
I developed and deployed over 1,000 complex prompts across various professional domains to assess model alignment using RLHF and A/B testing processes. My work involved structured evaluation of model outputs, identifying failures against the Helpful, Harmless, and Honest standards, and writing detailed performance reports. I conducted over 800 peer reviews to maintain model fidelity and created prompt injection attack vectors to reveal vulnerabilities in safety guardrails.
• Designed, reviewed, and rated a high volume of prompts and model responses using RLHF pipelines.
• Performed peer review of AI model outputs and flagged inconsistent or low-quality samples in annotated datasets.
• Engineered adversarial prompts to test policy adherence and expose safety weaknesses in LLMs.
• Provided structured, asynchronous feedback to project leads on rubric clarity, annotation reporting, and category alignment.