Wen Yuan

LLM Evaluation & UI-Focused Prompt Reviewer with AI Training Experience

Seattle, USA
$25.00/hr · Intermediate · Other

Key Skills

Software

Other

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming

Top Task Types

Computer Programming Coding

Freelancer Overview

I’m a Computer Science graduate with hands-on experience in full-stack development and AI model training. Over the past year, I’ve contributed to large language model evaluation projects, focusing on UI/UX feedback, prompt design, and output validation for English-language tasks.

My background in frontend development (React, TypeScript, Tailwind CSS) allows me to bring a strong design and usability perspective to annotation work. I also have two years of experience in data analysis using Python and SQL, which sharpens my attention to detail and pattern recognition.

In addition, I’m fluent in both Mandarin and English and comfortable working with technical topics. I have experience reviewing prompt-response pairs, checking factual accuracy, and cross-referencing supporting documents, skills that align closely with the requirements of this project. I’m confident in my ability to complete this work efficiently while maintaining high standards of quality and consistency.

Intermediate · English · Chinese (Mandarin)

Labeling Experience

LLM Code Generation and Programming Response Evaluation

Other · Computer Code Programming · Computer Programming Coding
Evaluated LLM-generated code snippets and programming explanations for correctness, clarity, and alignment with the prompt. Reviewed Python, JavaScript, and HTML/CSS tasks to identify functional errors, logic flaws, or unclear output. Labeled issues such as missing edge cases, insecure practices, or hallucinated APIs. Also assessed prompt effectiveness in guiding code generation and provided structured feedback to enhance future model outputs.


2024

Mandarin QA Pair Evaluation for Technical Domain

Other · Text · Prompt Response Writing (SFT)
Reviewed 100+ Mandarin prompt-response pairs in a technical domain, with a focus on evaluating answer quality, clarity, and factual correctness. Tasks included identifying irrelevant or unsupported answers, validating against provided documents, and flagging weak justifications. Followed strict quality criteria and maintained consistency across 20–40 hours of detailed annotation work.


2024

Education


University of Washington

Master of Science, Computer Science

2015 - 2017

Wuhan University

Bachelor of Science, Computer Science and Technology

2011 - 2015

Work History


Outlier

AI Model Trainer

Remote
2024 - Present

Amazon

Full-Stack Software Engineer

Seattle
2020 - Present