QA Program Manager and Engineer (GenAI Marketing Platform)
Led model evaluation and red-team testing to assess LLM output quality and identify edge cases before production deployment. Owned and executed structured QA processes for a GenAI ad creation platform, validating generative AI outputs and model behavior against defined success/failure criteria. Developed and applied an experimentation framework to measure model performance in real-world marketing scenarios. Conducted red-team testing to probe for vulnerabilities, test robustness, and surface abnormal LLM outputs. Collaborated closely with engineering and product teams to refine labeling and evaluation protocols. Used AI-first and proprietary QA tooling to automate and scale evaluation workflows.