Christopher Thomasson

Senior AI Evaluation Analyst | LLM Training, Data Labeling & AI Annotation Specialist (Remote)

Los Angeles, USA
$30.00/hr | Expert | Labelbox | Scale AI | SuperAnnotate

Key Skills

Software

Labelbox
Scale AI
SuperAnnotate
Toloka
Argilla
Data Annotation Tech
HiveMind
Mindrift
OneForma
Remotasks
Snorkel AI
Surge AI
Telus
iMerit
Micro1
AWS SageMaker
Anno-Mage
Axiom AI
CloudFactory
CrowdFlower
Dataloop
Mercor

Top Subject Matter

Artificial Intelligence & Machine Learning
Natural Language Processing (NLP)
Data Annotation / AI Training

Top Data Types

Text
Document
Computer Code / Programming

Top Task Types

Classification
Prompt Response Writing (SFT)
RLHF
Computer Programming / Coding
Transcription
Text Generation
Evaluation / Rating
Question Answering
Text Summarization
Entity (NER) Classification
Object Detection
Data Collection

Freelancer Overview

AI Evaluation and Data Annotation Specialist with over 5 years of experience supporting machine learning and AI model training through large-scale dataset annotation, model output evaluation, and data quality assurance. Evaluated AI-generated images for composition, lighting consistency, anatomical accuracy, and material realism while documenting recurring model failure patterns. Skilled in analyzing large language model (LLM) responses as well as AI-generated images for fidelity, consistency, and adherence to task instructions. Experienced in prompt design, evaluation workflows, dataset validation, and human-in-the-loop annotation systems used to improve AI training pipelines.

Expert | English | Spanish

Labeling Experience

Senior AI Evaluation Analyst (Remote)

Image | Evaluation / Rating
Independent Contractor | 2022 – Present
Reviewed and evaluated 40,000+ AI-generated outputs across multiple domains, including text and visual datasets. Applied structured evaluation metrics for relevance, factual correctness, and reasoning quality. Designed labeling guidelines and scoring rubrics to ensure consistent annotation across teams. Maintained >99% accuracy while working on high-volume annotation tasks. Documented recurring model errors and dataset inconsistencies to improve AI training data.

2022 - Present
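
To illustrate the kind of structured, rubric-based scoring described above, here is a minimal Python sketch. The dimension names, weights, and 1–5 scale are illustrative assumptions, not the rubrics used on these engagements.

```python
# Minimal sketch of rubric-based scoring for AI-generated outputs.
# Dimension names, weights, and the 1-5 scale are illustrative assumptions.
from dataclasses import dataclass

RUBRIC_WEIGHTS = {
    "relevance": 0.4,            # does the output address the prompt?
    "factual_correctness": 0.4,  # are the claims verifiably true?
    "reasoning_quality": 0.2,    # is the chain of reasoning sound?
}

@dataclass
class Evaluation:
    output_id: str
    scores: dict  # dimension name -> integer score on a 1-5 scale

    def weighted_score(self) -> float:
        """Combine per-dimension scores into a single weighted rating."""
        return sum(w * self.scores[dim] for dim, w in RUBRIC_WEIGHTS.items())

    def flagged_dimensions(self, threshold: int = 2) -> list:
        """Dimensions at or below the threshold, for error documentation."""
        return [dim for dim, s in self.scores.items() if s <= threshold]

if __name__ == "__main__":
    ev = Evaluation("sample-001",
                    {"relevance": 5, "factual_correctness": 2, "reasoning_quality": 4})
    print(ev.weighted_score())      # 3.6
    print(ev.flagged_dimensions())  # ['factual_correctness']
```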

Prompt Optimization for Instruction Following (Research Project)

Text | Prompt Response Writing (SFT)
For the Prompt Optimization for Instruction Following project, I tested and optimized prompt structures to generate high-quality labeled datasets for AI training. My analysis identified prompt patterns that improved AI response accuracy, directly supporting the instruction-following capabilities of downstream models. This work enhanced the development of supervised fine-tuning datasets for cutting-edge LLMs.
• Designed and iterated prompts for effective instruction following
• Evaluated model responses for use in supervised training data
• Improved response accuracy for downstream model performance
• Provided feedback on prompt strategies to increase dataset quality

2021 - Present
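
A small sketch of how prompt variants can be compared on a shared task set, in the spirit of the project above; the exact-match metric and the generate() callable are placeholders for whichever model harness is actually used.

```python
# Sketch of comparing prompt variants by response accuracy on a shared task set.
# The exact-match metric and the generate() callable are placeholders.
from collections import defaultdict

def score_response(response: str, expected: str) -> int:
    """Toy exact-match check; real projects use task-specific metrics."""
    return int(response.strip().lower() == expected.strip().lower())

def compare_prompt_variants(variants, tasks, generate):
    """Return mean accuracy per prompt variant.

    `generate(prompt)` is an assumed callable that returns the model response.
    """
    scores = defaultdict(list)
    for name, template in variants.items():
        for task in tasks:
            prompt = template.format(input=task["input"])
            scores[name].append(score_response(generate(prompt), task["expected"]))
    return {name: sum(s) / len(s) for name, s in scores.items()}

if __name__ == "__main__":
    variants = {
        "bare": "Answer the question: {input}",
        "explicit": "Answer in one word, exactly as asked: {input}",
    }
    tasks = [{"input": "What is the capital of France?", "expected": "Paris"}]
    dummy_generate = lambda prompt: "Paris"  # stand-in for a real model call
    print(compare_prompt_variants(variants, tasks, dummy_generate))
```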

Human Evaluation Framework for LLM Reliability (Doctoral Research)

Text
As part of my PhD research, I developed and oversaw human evaluation frameworks for improving LLM reliability through rigorous annotation. I created scoring rubrics and improved labeling criteria clarity to reduce variance in human-labeled data. My work directly contributed to better data quality standards for large language model training.
• Developed structured rubrics for annotation tasks
• Coordinated human raters for consistent and unbiased evaluations
• Analyzed variance in human-labeled outputs and improved training guidelines
• Documented findings to inform future AI model dataset construction

2021 - Present
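
One way to quantify the rater variance mentioned above is a chance-corrected agreement statistic; below is a self-contained Cohen's kappa sketch with made-up labels, not data from the research itself.

```python
# Sketch of measuring agreement between two raters with Cohen's kappa.
# The label set and ratings are made up for illustration.
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label]
                   for label in set(rater_a) | set(rater_b)) / (n * n)
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    a = ["good", "good", "bad", "good", "bad", "bad"]
    b = ["good", "bad", "bad", "good", "bad", "good"]
    print(round(cohen_kappa(a, b), 3))  # 0.333
```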

Machine Learning Research Associate

Text | Classification
As a Machine Learning Research Associate, I prepared and validated large NLP datasets for AI experiments. I was responsible for clean and accurate labeling, as well as reporting on data quality and annotation protocols. My work included building Python pipelines and benchmarking dataset quality for model evaluation.
• Built data processing pipelines for pre-processing, deduplication, and standardization
• Assisted in experimental evaluation of model performance using labeled data
• Ensured annotation protocols were rigorously documented
• Applied error detection and quality assurance procedures to labeled datasets

2020 - 2022
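
A condensed sketch of the kind of Python pre-processing and deduplication pipeline referred to above; the field names and normalization rules are assumptions for illustration.

```python
# Sketch of a text pre-processing pipeline: standardization, deduplication,
# and a simple keep/drop report. Field names are illustrative assumptions.
import hashlib
import unicodedata

def standardize(text: str) -> str:
    """Normalize unicode and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())

def deduplicate(records):
    """Drop records whose standardized, lowercased text was already seen."""
    seen, unique = set(), []
    for rec in records:
        clean = standardize(rec["text"])
        key = hashlib.sha256(clean.lower().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append({**rec, "text": clean})
    return unique

if __name__ == "__main__":
    raw = [
        {"text": "The  model answered\u00a0correctly."},
        {"text": "the model answered correctly."},  # duplicate after normalization
        {"text": "A different example."},
    ]
    print(f"kept {len(deduplicate(raw))} of {len(raw)} records")  # kept 2 of 3
```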

Data Quality Specialist (Remote)

Text | Classification
As a Data Quality Specialist, I audited both structured and unstructured datasets to identify inconsistencies and labeling errors. I implemented cross-validation and other rule-based quality checks to improve overall annotation reliability. My efforts ensured large datasets were analysis-ready for downstream AI model training.
• Organized and normalized large, complex datasets for machine learning workflows
• Improved label reliability through auditing and corrective feedback
• Executed high volumes of annotation tasks with a focus on quality assurance
• Supported remote annotation teams to enforce guideline compliance

2019 - 2020
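
The rule-based checks mentioned above can be expressed as small audit functions that return failing row indices; the column names and allowed label set below are assumptions for illustration, not the schema used on the job.

```python
# Sketch of rule-based label audits; each check returns failing row indices
# so problem records can be routed back for correction.
# Column names and the allowed label set are illustrative assumptions.
ALLOWED_LABELS = {"positive", "negative", "neutral"}

def check_missing_text(rows):
    return [i for i, r in enumerate(rows) if not (r.get("text") or "").strip()]

def check_invalid_label(rows):
    return [i for i, r in enumerate(rows) if r.get("label") not in ALLOWED_LABELS]

def check_conflicting_duplicates(rows):
    """Flag rows whose text already appeared with a different label."""
    first_seen, conflicts = {}, []
    for i, r in enumerate(rows):
        text = (r.get("text") or "").strip().lower()
        if text in first_seen and rows[first_seen[text]]["label"] != r["label"]:
            conflicts.append(i)
        first_seen.setdefault(text, i)
    return conflicts

def audit(rows):
    return {
        "missing_text": check_missing_text(rows),
        "invalid_label": check_invalid_label(rows),
        "conflicting_duplicates": check_conflicting_duplicates(rows),
    }

if __name__ == "__main__":
    rows = [
        {"text": "Great service", "label": "positive"},
        {"text": "great service", "label": "negative"},  # conflicts with row 0
        {"text": "", "label": "neutral"},                # missing text
        {"text": "Okay I guess", "label": "mixed"},      # not an allowed label
    ]
    print(audit(rows))
```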

Education

University of California

Master of Science, Computer Science

2019 - 2021

University of California

Bachelor of Science, Computer Science

2015 - 2019

Work History

University of California

Machine Learning Research Associate

Los Angeles
2020 - 2022