Safety Alignment: Toxicity, Risk Mitigation & Response Correction
Evaluated high-risk prompts and model outputs for safety violations, misinformation, bias, and harmful reasoning. Generated safer alternative responses, rewrote risky outputs, and tagged safety categories with high precision. Contributed to datasets used to train AI systems to avoid harmful or unethical behavior.