Chrissean Body

AI Evaluation & Data Labeling Specialist: LLM Review, RLHF & Prompt QC

Alabama, USA
$30.00/hr · Expert · AWS SageMaker · Anno-Mage · Appen

Key Skills

Software

AWS SageMaker
Anno-Mage
Appen
Axiom AI
Clickworker
CloudFactory
Datature
Deep Systems
Diffgram
Labelbox
Mindrift
OneForma
Remotasks
Sama
Snorkel AI
Trilldata Technologies
V7 Labs
Scale AI
CVAT

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Computer Code / Programming
Video

Top Task Types

Computer Programming / Coding
Prompt/Response Writing (SFT)
RLHF
Text Generation
Text Summarization

Freelancer Overview

I am an experienced AI Evaluation and Data Labeling Specialist with over five years of hands-on work in rubric-based review, LLM evaluation, and multi-modal annotation. My expertise covers prompt and response analysis, RLHF, SFT, classification, and adversarial red-teaming. I specialize in detecting factual inaccuracies, bias, ambiguity, and policy non-compliance while delivering consistent, high-quality outputs that align with evolving guidelines.

I have worked across multiple vendor platforms, including Scale AI, Surge AI, DataAnnotation.Tech, Outlier, Appen, TELUS International, and TaskUs, consistently maintaining accuracy rates above 98%. With a Master's in Language Technology and a Bachelor's in English and Linguistics from the University of Alabama, I bring both academic depth and applied experience in annotation frameworks, computational linguistics, and evaluation design.

My background in teaching, copyediting, and auditing strengthens my ability to apply detailed rubrics, provide actionable feedback, and ensure cross-team consistency. I excel in high-volume, deadline-driven environments and am passionate about improving model fairness, safety, and overall performance through careful, detail-oriented evaluation.

Expert · English · Spanish

Labeling Experience

Snorkel AI

Academic Linguistics & Annotation Research

Snorkel AI · Text · Entity/NER Classification · Relationship
As part of graduate-level research in Language Technology, designed and applied annotation frameworks for syntax, semantics, and classification tasks. Built experimental rubrics to measure inter-rater consistency and applied hybrid rule-based/ML models to analyze annotated corpora. Delivered reports on annotation quality and evaluator agreement that informed faculty research and course development. This work provided the academic foundation for professional success in AI data labeling and evaluation.


2023 - 2025
Scale AI

LLM Prompt and Response Evaluation

Scale AI · Text · Classification · Text Generation
Evaluated thousands of AI-generated responses for accuracy, clarity, fairness, and adherence to policy guidelines. Performed ranking tasks (RLHF) to improve reward model training, applied detailed rubrics to assess ambiguity, bias, and factual correctness, and created adversarial prompts for red-teaming to expose safety blind spots. Delivered second-pass audits to ensure quality consistency and drafted gold-standard exemplars to train new contributors. Maintained >98% accuracy and exceeded throughput benchmarks consistently.


2020 - 2024
Mindrift

Academic Linguistics & Annotation Research

Mindrift · Text · Entity/NER Classification · Classification
As part of graduate research in Language Technology, designed and applied annotation frameworks for syntax, semantics, and text classification. Built experimental rubrics to measure inter-rater consistency and applied hybrid rule-based/ML models to analyze annotated corpora. Delivered detailed reports on annotation quality and evaluator agreement that informed faculty research in computational linguistics. This academic foundation directly supports professional expertise in rubric-driven AI evaluation.


2017 - 2023
Appen

Multi-Modal Annotation – Text and Image Labeling

Appen · Image · Classification · Question Answering
Annotated and categorized large datasets of text and image content for search relevance, content moderation, and classification. Applied strict rubrics to ensure labeling consistency, flagged edge cases, and documented ambiguity to refine project guidelines. Conducted audits of contributor work, providing structured feedback that reduced error rates by 20%. Contributed to dataset quality that improved client-facing recommendation systems and content safety pipelines.


2018 - 2020
CVAT

Autonomous Vehicle Perception Annotation

CVAT · Video · Bounding Box · Polygon
Labeled objects such as vehicles, pedestrians, and traffic signs in video and LiDAR datasets to train self-driving car perception systems. Used bounding box, polygon, and object-tracking techniques to ensure high-accuracy spatial annotations. Applied QA rubrics to validate scene classifications and performed audits on contributor annotations to reduce labeling inconsistencies. Improved dataset reliability that directly enhanced perception and safety models for autonomous vehicles.


2017 - 2019

Education

University of Alabama

Master of Arts, Language Technology

2015 - 2017
University of Alabama

Bachelor of Arts, English and Linguistics

2011 - 2015

Work History

Remote

Freelance AI Evaluation Specialist

Birmingham
2020 - Present
Independent Contractor

Quality Reviewer / Copy Editor

Birmingham
2017 - 2020