Candidates must hold a bachelor's degree or higher in a relevant field such as Communications, Linguistics, Psychology, Law/Policy, or Security Studies, or have equivalent professional experience. Near-native or native Japanese proficiency and at least C1-level English proficiency are required. Experience in trust and safety, content moderation, or adversarial testing/red-teaming of LLMs is essential, as are strong analytical writing skills. Emotional resilience is important, as the work may involve exposure to explicit or disturbing content.

The project focuses on labeling and quality-checking safety data covering hate speech, harassment, sexual content, violence, self-harm, bias, illegal activities, and misinformation in Japanese- and English-language AI-generated content. Contributors will perform red-teaming and adversarial testing, apply safety policies, identify cultural nuances, analyze policy alignment, and recommend mitigations to improve the overall safety of leading AI models.
Estimated Total Earnings
$3,000.00
Pay per Hour
$30.00/hr
Time Requirement
20+ hrs/week
Duration
3-6 months
Safety review of AI-generated Japanese and English text
Workload / Schedule
Expected weekly commitment is at least 20 hours, with a project duration of 3 to 6 months. Labelers are expected to meet milestone deadlines and quality checkpoints.