Wenjing

AI Model Evaluation

Vancouver, Canada
$20.00/hr
Entry Level

Key Skills

Software

No software listed

Top Subject Matter

AI-generated content evaluation
scientific reasoning
STEM

Top Data Types

Text
Audio
Image

Top Task Types

Transcription
RLHF

Freelancer Overview

Freelance AI model evaluator and data annotator with 9+ years of professional experience spanning research and quality-focused work. Core strengths include internal and proprietary tooling. Holds a Doctor of Philosophy from the Chinese Academy of Sciences (2016). AI-training work covers text, audio, and image data, with labeling workflows including evaluation, rating, and transcription.

Entry Level
English

Labeling Experience

AI Model Evaluation & Data Annotation Experience | Freelancer | Human Preference & Feedback Labeling

Text, RLHF
I participated in human preference and ranking tasks for AI-generated text outputs as part of reinforcement learning from human feedback (RLHF). This involved structured evaluation of each response's helpfulness, clarity, accuracy, and naturalness. I provided justifications for my preference judgments to improve RLHF signals during model fine-tuning.

• Ranked model-generated text outputs comparatively
• Assessed each output for accuracy, helpfulness, and fluency across subject domains
• Provided structured written justifications for choices and rankings
• Supported reinforcement learning processes with high-quality human feedback

2026 - Present

AI Model Evaluation & Data Annotation Experience | Freelancer | Image Evaluation & Pairwise Ranking

Image
I conducted structured evaluation and pairwise comparison of AI-generated images for realism, accuracy, and naturalness. My role included identifying visual reasoning errors and inconsistencies within the provided images. Pairwise ranking was used to determine the preferred or higher-quality output for model improvement.

• Compared images for alignment with scientific context and environmental cues
• Assessed realism, factual consistency, and the logic of scene interactions
• Flagged discrepancies such as object direction errors or unrealistic details
• Supported the evaluation protocol with comprehensive justifications for rankings

2026 - Present

AI Model Evaluation & Data Annotation Experience | Freelancer | Audio/Speech Annotation

Audio, Transcription
I verified and corrected AI-generated speech-to-text transcriptions for accuracy and completeness. This involved annotating and labeling audio data for specific speech and non-verbal audio events. My work ensured transcription fidelity and detailed, structured alignment with the source data.

• Annotated audio segments with detailed event labels (pauses, fillers, non-verbal cues)
• Performed quality control checks on speech recognition output
• Corrected errors in AI-generated transcriptions to support the training data pipeline
• Documented ambiguous or challenging cases for further review

2026 - Present

AI Model Evaluation & Data Annotation Experience | Freelancer | Text-Based AI Evaluation

Text
I performed comprehensive evaluation and annotation of AI-generated text outputs using structured, rubric-based frameworks. Tasks included assessing logical consistency, factual accuracy, scientific correctness, and clarity of text responses for model alignment. I applied detailed feedback protocols to ensure high-quality AI performance and model improvements.

• Scored model responses based on criteria such as reasoning depth and factual correctness
• Identified logic gaps, hallucinations, and ambiguous or incomplete answers
• Provided structured written feedback to support RLHF and model tuning
• Ensured rubric and guideline consistency across evaluation tasks

2026 - Present

Education

Chinese Academy of Sciences

Doctor of Philosophy, Biochemistry and Molecular Biology

2010 - 2016

Work History

CCOA Therapeutics

Project Coordinator and CFO Assistant

Toronto
2021 - Present

Unity Health Toronto

Senior Research Associate and Postdoctoral Fellow

Toronto
2018 - 2025