Pangoling Safety
The project aims to refine the refusal and safety mechanisms of AI models. It ensures that when a user asks for something dangerous (e.g., illegal instructions, hate speech, or self-harm), the model responds safely, neutrally, and according to strict ethical guidelines without being overly restrictive on benign topics. Sub-projects: - Pangolin Text: (mostly) Focusing on written dialogue and reasoning. - Pangolin Vision: Evaluating how AI interprets and responds to sensitive or unsafe images.