Christopher Thomasson

Senior AI Evaluation Analyst | LLM Training, Data Labeling & AI Annotation Specialist (Remote)

Los Angeles, USA
$30.00/hr | Expert | Labelbox | Scale AI | SuperAnnotate

Key Skills

Software

Labelbox
Scale AI
SuperAnnotate
Toloka
Argilla
Data Annotation Tech
HiveMind
Mindrift
OneForma
Remotasks
Snorkel AI
Surge AI
Telus
iMerit
Micro1
AWS SageMaker
Anno-Mage
Axiom AI
CloudFactory
CrowdFlower
Dataloop
Mercor

Top Subject Matter

Artificial Intelligence & Machine Learning
Natural Language Processing (NLP)
Data Annotation / AI Training

Top Data Types

Text
Document
Computer Code / Programming

Top Task Types

Classification
Prompt Response Writing (SFT)
RLHF
Computer Programming / Coding
Transcription
Text Generation
Evaluation / Rating
Question Answering
Text Summarization
Entity (NER) Classification
Object Detection
Data Collection

Freelancer Overview

AI Evaluation and Data Annotation Specialist with over 5 years of experience supporting machine learning and AI model training through large-scale dataset annotation, model output evaluation, and data quality assurance. Evaluated AI-generated images for composition, lighting consistency, anatomical accuracy, and material realism while documenting recurring model failure patterns. Skilled in analyzing large language model (LLM) responses as well as AI-generated images for fidelity, consistency, and adherence to task instructions. Experienced in prompt design, evaluation workflows, dataset validation, and human-in-the-loop annotation systems used to improve AI training pipelines.

Expert | English | Spanish

Labeling Experience

Senior AI Evaluation Analyst (Remote)

Image | Evaluation / Rating
Independent Contractor | 2022 – Present
Reviewed and evaluated 40,000+ AI-generated outputs across multiple domains, including text and visual datasets. Applied structured evaluation metrics for relevance, factual correctness, and reasoning quality. Designed labeling guidelines and scoring rubrics to ensure consistent annotation across teams. Maintained >99% accuracy while working on high-volume annotation tasks. Documented recurring model errors and dataset inconsistencies to improve AI training data.

2022 - Present
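
To illustrate the kind of structured, rubric-based scoring described above, here is a minimal Python sketch. The dimension names, weights, and 1–5 scale are illustrative assumptions, not the rubrics used on these engagements.

```python
# Minimal sketch of rubric-based scoring for AI-generated outputs.
# Dimension names, weights, and the 1-5 scale are illustrative assumptions.
from dataclasses import dataclass

RUBRIC_WEIGHTS = {
    "relevance": 0.4,            # does the output address the prompt?
    "factual_correctness": 0.4,  # are the claims verifiably true?
    "reasoning_quality": 0.2,    # is the chain of reasoning sound?
}

@dataclass
class Evaluation:
    output_id: str
    scores: dict  # dimension name -> integer score on a 1-5 scale

    def weighted_score(self) -> float:
        """Combine per-dimension scores into a single weighted rating."""
        return sum(w * self.scores[dim] for dim, w in RUBRIC_WEIGHTS.items())

    def flagged_dimensions(self, threshold: int = 2) -> list:
        """Dimensions at or below the threshold, for error documentation."""
        return [dim for dim, s in self.scores.items() if s <= threshold]

if __name__ == "__main__":
    ev = Evaluation("sample-001",
                    {"relevance": 5, "factual_correctness": 2, "reasoning_quality": 4})
    print(ev.weighted_score())      # 3.6
    print(ev.flagged_dimensions())  # ['factual_correctness']
```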

Prompt Optimization for Instruction Following (Research Project)

Text | Prompt Response Writing (SFT)
For the Prompt Optimization for Instruction Following project, I tested and optimized prompt structures to generate high-quality labeled datasets for AI training. My analysis identified prompt patterns that improved AI response accuracy, directly supporting the instruction-following capabilities of downstream models. This work enhanced the development of supervised fine-tuning datasets for cutting-edge LLMs.
• Designed and iterated prompts for effective instruction following
• Evaluated model responses for use in supervised training data
• Improved response accuracy for downstream model performance
• Provided feedback on prompt strategies to increase dataset quality

2021 - Present
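
A small sketch of how prompt variants can be compared on a shared task set, in the spirit of the project above; the exact-match metric and the generate() callable are placeholders for whichever model harness is actually used.

```python
# Sketch of comparing prompt variants by response accuracy on a shared task set.
# The exact-match metric and the generate() callable are placeholders.
from collections import defaultdict

def score_response(response: str, expected: str) -> int:
    """Toy exact-match check; real projects use task-specific metrics."""
    return int(response.strip().lower() == expected.strip().lower())

def compare_prompt_variants(variants, tasks, generate):
    """Return mean accuracy per prompt variant.

    `generate(prompt)` is an assumed callable that returns the model response.
    """
    scores = defaultdict(list)
    for name, template in variants.items():
        for task in tasks:
            prompt = template.format(input=task["input"])
            scores[name].append(score_response(generate(prompt), task["expected"]))
    return {name: sum(s) / len(s) for name, s in scores.items()}

if __name__ == "__main__":
    variants = {
        "bare": "Answer the question: {input}",
        "explicit": "Answer in one word, exactly as asked: {input}",
    }
    tasks = [{"input": "What is the capital of France?", "expected": "Paris"}]
    dummy_generate = lambda prompt: "Paris"  # stand-in for a real model call
    print(compare_prompt_variants(variants, tasks, dummy_generate))
```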

Human Evaluation Framework for LLM Reliability (Doctoral Research)

Text
As part of my PhD research, I developed and oversaw human evaluation frameworks for improving LLM reliability through rigorous annotation. I created scoring rubrics and improved labeling criteria clarity to reduce variance in human-labeled data. My work directly contributed to better data quality standards for large language model training.
• Developed structured rubrics for annotation tasks
• Coordinated human raters for consistent and unbiased evaluations
• Analyzed variance in human-labeled outputs and improved training guidelines
• Documented findings to inform future AI model dataset construction

2021 - Present
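
One way to quantify the rater variance mentioned above is a chance-corrected agreement statistic; below is a self-contained Cohen's kappa sketch with made-up labels, not data from the research itself.

```python
# Sketch of measuring agreement between two raters with Cohen's kappa.
# The label set and ratings are made up for illustration.
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label]
                   for label in set(rater_a) | set(rater_b)) / (n * n)
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    a = ["good", "good", "bad", "good", "bad", "bad"]
    b = ["good", "bad", "bad", "good", "bad", "good"]
    print(round(cohen_kappa(a, b), 3))  # 0.333
```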

Machine Learning Research Associate

Text | Classification
As a Machine Learning Research Associate, I prepared and validated large NLP datasets for AI experiments. I was responsible for clean and accurate labeling, as well as reporting on data quality and annotation protocols. My work included building Python pipelines and benchmarking dataset quality for model evaluation.
• Built data processing pipelines for pre-processing, deduplication, and standardization
• Assisted in experimental evaluation of model performance using labeled data
• Ensured annotation protocols were rigorously documented
• Applied error detection and quality assurance procedures to labeled datasets

2020 - 2022
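
A condensed sketch of the kind of Python pre-processing and deduplication pipeline referred to above; the field names and normalization rules are assumptions for illustration.

```python
# Sketch of a text pre-processing pipeline: standardization, deduplication,
# and a simple keep/drop report. Field names are illustrative assumptions.
import hashlib
import unicodedata

def standardize(text: str) -> str:
    """Normalize unicode and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())

def deduplicate(records):
    """Drop records whose standardized, lowercased text was already seen."""
    seen, unique = set(), []
    for rec in records:
        clean = standardize(rec["text"])
        key = hashlib.sha256(clean.lower().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append({**rec, "text": clean})
    return unique

if __name__ == "__main__":
    raw = [
        {"text": "The  model answered\u00a0correctly."},
        {"text": "the model answered correctly."},  # duplicate after normalization
        {"text": "A different example."},
    ]
    print(f"kept {len(deduplicate(raw))} of {len(raw)} records")  # kept 2 of 3
```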

Data Quality Specialist (Remote)

Text | Classification
As a Data Quality Specialist, I audited both structured and unstructured datasets to identify inconsistencies and labeling errors. I implemented cross-validation and other rule-based quality checks to improve overall annotation reliability. My efforts ensured large datasets were analysis-ready for downstream AI model training.
• Organized and normalized large, complex datasets for machine learning workflows
• Improved label reliability through auditing and corrective feedback
• Executed high volumes of annotation tasks with a focus on quality assurance
• Supported remote annotation teams to enforce guideline compliance

2019 - 2020
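
The rule-based checks mentioned above can be expressed as small audit functions that return failing row indices; the column names and allowed label set below are assumptions for illustration, not the schema used on the job.

```python
# Sketch of rule-based label audits; each check returns failing row indices
# so problem records can be routed back for correction.
# Column names and the allowed label set are illustrative assumptions.
ALLOWED_LABELS = {"positive", "negative", "neutral"}

def check_missing_text(rows):
    return [i for i, r in enumerate(rows) if not (r.get("text") or "").strip()]

def check_invalid_label(rows):
    return [i for i, r in enumerate(rows) if r.get("label") not in ALLOWED_LABELS]

def check_conflicting_duplicates(rows):
    """Flag rows whose text already appeared with a different label."""
    first_seen, conflicts = {}, []
    for i, r in enumerate(rows):
        text = (r.get("text") or "").strip().lower()
        if text in first_seen and rows[first_seen[text]]["label"] != r["label"]:
            conflicts.append(i)
        first_seen.setdefault(text, i)
    return conflicts

def audit(rows):
    return {
        "missing_text": check_missing_text(rows),
        "invalid_label": check_invalid_label(rows),
        "conflicting_duplicates": check_conflicting_duplicates(rows),
    }

if __name__ == "__main__":
    rows = [
        {"text": "Great service", "label": "positive"},
        {"text": "great service", "label": "negative"},  # conflicts with row 0
        {"text": "", "label": "neutral"},                # missing text
        {"text": "Okay I guess", "label": "mixed"},      # not an allowed label
    ]
    print(audit(rows))
```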

Education

University of California

Master of Science, Computer Science

2019 - 2021

University of California

Bachelor of Science, Computer Science

2015 - 2019

Work History

University of California

Machine Learning Research Associate

Los Angeles
2020 - 2022