AI Trainer & Data Evaluation Specialist
I performed reinforcement learning from human feedback (RLHF) tasks on large language models, assessing response quality, safety, and accuracy. My work included prompt engineering, red teaming to uncover biases or risks, and multi-step reasoning annotation to support consistent model logic. I applied deep expertise in direct-response marketing standards to ensure content aligned with real-world campaign demands.
• Graded model responses for factuality, tone, and helpfulness
• Engineered adversarial prompts and conducted red teaming evaluations
• Annotated datasets for chain-of-thought reasoning (multi-step logic)
• Optimized outputs to meet direct-response copywriting criteria