AI Safety Researcher — LLM Red Teaming & Prompt Injection Evaluation
Conducted red teaming and prompt injection testing to evaluate the safety guardrails of large language models. Assessed LLM responses for compliance with safety objectives, identified vulnerabilities to adversarial prompts, and delivered structured evaluations with detailed feedback to strengthen AI safety mechanisms.
• Performed systematic safety assessments across multiple AI platforms
• Analyzed LLM behavior under adversarial prompt conditions
• Documented weaknesses and recommended safety improvements
• Collaborated with developers to remediate safety bypass issues