Kyota

AI Data Trainer (RLHF) / Squad Reviewer, Contributor, Reviewer

Tokyo, Japan
$45.00/hr, Expert, Clickworker, Data Annotation Tech, iMerit

Key Skills

Software

Clickworker
Data Annotation Tech
iMerit
Scale AI
Toloka

Top Subject Matter

AI model alignment
Reinforcement Learning from Human Feedback (RLHF)

Top Data Types

Text
Computer Code Programming

Top Task Types

RLHF
Fine-tuning
Computer Programming/Coding
Data Collection
Function Calling
Prompt + Response Writing (SFT)

Freelancer Overview

AI Data Trainer (RLHF) / Squad Reviewer, Contributor, Reviewer with 4+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Education includes a Bachelor of Science from Sunway University (2022) and a Non-Degree Program at 42 Tokyo (2022). AI-training focus covers text data and RLHF labeling workflows.

Expert, English, Japanese

Labeling Experience

AI Data Trainer (RLHF) / Squad Reviewer, Contributor, Reviewer

Text, RLHF
As an AI Data Trainer at Outlier, I contributed to reinforcement learning from human feedback (RLHF) projects to improve AI behavior and language models. My work involved reviewing, evaluating, and providing feedback on AI-generated responses to ensure accuracy and alignment with human expectations. I utilized my expertise to enhance language models through hands-on evaluation and feedback cycles.
• Conducted evaluation and feedback tasks for large language models (LLMs).
• Collaborated with other contributors to align results with project standards.
• Applied knowledge of AI and natural language processing in daily tasks.
• Utilized proprietary or internal tools for labeling and RLHF tasks.

2023 - Present

Education

42 Tokyo

Non-Degree Program, Computer Science

2021 - 2022

Sunway University

Bachelor of Science, Computer Science

2019 - 2022

Work History

Openupitengineer.Inc

Cloud Engineer

Tokyo
2023 - 2024

Yaruki Switch Group

Python Programming Teacher

Tokyo
2021 - 2023