Trust & Safety / LLM Evaluation Specialist
Assessed and labeled AI-generated text outputs in Spanish and English for Trust & Safety and LLM evaluation. Conducted adversarial testing, red-teaming, and policy alignment checks to strengthen model robustness and ensure safety compliance. Performed content categorization, labeling, and output ranking for reinforcement learning from human feedback (RLHF).
• Evaluated text generation quality, correctness, and policy compliance
• Applied emotional resilience in sensitive content scenarios
• Followed structured guidelines for labeling and rating
• Supported ongoing model improvement and dataset quality