Safety Red Teamer
The tasks at Mercor included reviewing and analyzing AI-generated responses: rating them, documenting their strengths and areas for improvement, and comparing responses from different AI models.
I have hands-on experience in AI data training and model evaluation, highlighted by a specialized work trial in AI red teaming at Mercor. In this role, I focused on adversarial testing: actively probing the model and attempting to bypass its safety policies to identify critical vulnerabilities. This safety red teaming involved stress-testing the model's guardrails to evaluate its robustness and verify alignment with ethical and safety protocols. In addition to vulnerability testing, I conducted qualitative evaluations of the AI's responses through a structured rating system, systematically analyzing model outputs, documenting the model's core strengths, and pinpointing specific areas for improvement. By combining targeted adversarial attacks with detailed performance assessments, I built a strong foundation in identifying edge cases and contributing to the iterative improvement of AI safety and response quality.
The task was a work trial at Mercor in AI red teaming, where I had to trick the LLM into bypassing or breaking its safety guidelines and policies in order to find the model's weaknesses.
Higher Secondary Education, Science
Bachelor of Computer Applications, Computer Applications
Banisha J. hasn’t added any Work History to their OpenTrain profile yet.