Joseph Yokobori

Evaluated and Refined LLM-generated text in Japanese & English

Tokyo, Japan
$30.00/hr, Entry Level, Data Annotation Tech, Scale AI, Other

Key Skills

Software

Data Annotation Tech
Scale AI
Other

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Image
Text

Top Task Types

Classification
Prompt Response Writing SFT
Translation Localization

Freelancer Overview

I have worked as a freelance AI evaluator for more than four months, contributing over 200 hours across multiple projects. My work focused on reviewing and improving LLM-generated responses in both Japanese and English: evaluating output quality, translating responses, and editing for fluency and contextual accuracy. I also handled categorization tasks such as identifying adversarial prompts and assessing potential risks. Throughout these roles, I consistently met quality and speed standards while adapting to increasingly complex tasks, including beta-phase projects that required fair and nuanced judgment.

In addition, I helped design fine-grained criteria for a range of response types and used them to refine model outputs. This involved identifying strengths and weaknesses in LLM-generated content, proposing concrete improvements, and ensuring alignment with user intent. I also conducted evaluations in STEM-related fields, particularly biology, drawing on my academic background to assess technical accuracy and clarity. My experience reflects strong linguistic skills, structured analytical thinking, and a deep understanding of how human feedback shapes language model development.

Entry Level, English, Japanese, Chinese (Mandarin)

Labeling Experience

Scale AI

Cypher

Scale AI, Text, Translation Localization, Evaluation Rating
Contributed to Cypher_Evals model evaluations by comparing pairs of LLM-generated responses. I assessed each pair against detailed criteria, including instruction following, truthfulness, localization, and verbosity. In addition to rating individual aspects, I gave an overall judgment on which response was better and explained the reasoning behind my choice. These tasks were completed under strict time and quality constraints, requiring both precision and efficiency.

2025

Data Annotation Tech

Creation of Fine-Grained Criteria

Data Annotation Tech, Text, Classification, Translation Localization
Created fine-grained evaluation criteria to assess the quality of LLM-generated responses, particularly in tasks requiring nuanced judgment, for both general and STEM-related prompts. These criteria were used to distinguish acceptable outputs from perfect ones, guide revisions, and support the creation of ideal responses.

2025

Education

Peking University

Immersion Program, Chinese Language

2023

Waseda University

Bachelor of Science, Life Science and Medical Bioscience

2022

Work History

University of Kansas Medical Center

Research Intern

Kansas
2024