Project: Hallucination Mitigation Study, Red Teaming Specialist
I conducted an in-depth analysis of hallucination failure points in AI models on DNA replication topics. My responsibilities included constructing adversarial prompts to identify and analyze model vulnerabilities and to stress-test safety guardrails. The project improved model robustness against generating inaccurate or unsafe scientific content.
• Designed and administered adversarial test cases targeting model weak spots.
• Logged, categorized, and explained failure modes related to biological content generation.
• Provided corrective feedback and scientific citations for detected errors.
• Helped develop new red-teaming protocols for ongoing safety evaluations.
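The workflow above (adversarial prompting, failure logging, categorization) can be sketched as a minimal red-teaming harness. This is an illustrative sketch only: the stub model, the prompt texts, the fact-check table, and the failure category name are all hypothetical, not the project's actual tooling.

```python
from dataclasses import dataclass, field

# Hypothetical keyword-based fact checks for DNA replication answers.
# A real evaluation would rely on expert review and scientific citations.
FACT_CHECKS = {
    "leading strand": "synthesized continuously",  # leading strand is made continuously
    "helicase": "unwinds",                         # helicase unwinds the double helix
}

@dataclass
class FailureLog:
    """Accumulates categorized failure modes for later review."""
    entries: list = field(default_factory=list)

    def record(self, prompt: str, response: str, category: str) -> None:
        self.entries.append({"prompt": prompt, "response": response, "category": category})

def stub_model(prompt: str) -> str:
    # Stand-in for a real model API call (assumption, not the studied model).
    canned = {
        "Which enzyme unwinds DNA?": "Helicase unwinds the DNA double helix.",
        # Deliberately wrong answer, to show a flagged hallucination:
        "How is the leading strand made?": "The leading strand is synthesized discontinuously.",
    }
    return canned.get(prompt, "I don't know.")

def run_red_team(prompts, model, log: FailureLog) -> FailureLog:
    """Run each adversarial prompt, check the response, and log failures."""
    for prompt in prompts:
        response = model(prompt).lower()
        for keyword, expected in FACT_CHECKS.items():
            if keyword in response and expected not in response:
                log.record(prompt, response, category="factual_hallucination")
    return log

log = run_red_team(
    ["Which enzyme unwinds DNA?", "How is the leading strand made?"],
    stub_model,
    FailureLog(),
)
print(len(log.entries))  # only the wrong leading-strand answer is flagged
```

The logged entries can then feed the categorization and corrective-feedback steps listed above; the keyword matching is only a placeholder for more careful scientific review.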