Senior AI Training Specialist (Red Teaming & Evaluation)
Developed and led adversarial evaluation efforts to probe LLM vulnerabilities using red-teaming protocols, designing and administering adversarial prompts to surface failure modes and risky behaviors. Findings from these tests directly informed security patches and mitigation strategies.
• Coordinated manual and automated adversarial prompt creation.
• Managed data collection for dangerous or undesirable model outputs.
• Implemented documentation protocols to track discovered vulnerabilities.
• Provided feedback to engineering teams to improve LLM safety and robustness.