Evaluator & Vulnerability Auditor — 15 Research Lab
Audited, labeled, and verified vulnerabilities in major AI evaluation frameworks through structured safety and security testing. Identified weaknesses in judge and classifier prompts and contributed confirmed findings via open issues and pull requests for remediation. Documented prompt injection variants and adversarial behaviors in frameworks such as ControlArena and HarmBench.
• Executed targeted prompt injection labeling and evaluation against leading frameworks.
• Produced structured vulnerability reports with detailed labeling of adversarial events.
• Maintained traceable logs of attack attempts and framework responses.
• Supported external confirmation and remediation of labeling-based findings.