AI Data Trainer / Annotation Quality Lead
Led production of reinforcement learning from human feedback (RLHF) preference data for large language models, ensuring top-tier annotation quality. Designed detailed annotation guidelines and ontologies for instruction-following, toxicity-detection, and factual-accuracy labeling. Built Python quality-assurance scripts to catch annotation errors, lowering the overall labeling error rate by 31% (a sketch of this kind of check appears after the list below).
• Authored calibration playbooks adopted by 3 client projects
• Collaborated with ML engineers on model evaluation and feedback
• Coordinated a team of 18 annotators and optimized annotation workflows
• Standardized project-level measures of annotation complexity and inter-annotator agreement (Cohen's κ > 0.88; see the second sketch below)
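
For illustration only, a minimal sketch of the kind of Python QA script described above; the file name annotations.csv, the CSV columns (id, label, rationale), and the label set are hypothetical placeholders, not the actual project schema:

```python
import csv

# Hypothetical preference-label ontology; the real project ontology differed.
ALLOWED_LABELS = {"prefer_a", "prefer_b", "tie"}

def qa_check(rows):
    """Yield (row_id, issue) for annotations that fail basic validity checks."""
    for row in rows:
        if row["label"] not in ALLOWED_LABELS:
            yield row["id"], f"label {row['label']!r} not in ontology"
        elif row["label"] != "tie" and not row["rationale"].strip():
            yield row["id"], "missing rationale for a non-tie preference"

# Report every flagged annotation for reviewer follow-up.
with open("annotations.csv", newline="") as f:
    for row_id, issue in qa_check(csv.DictReader(f)):
        print(f"{row_id}: {issue}")
```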
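
The κ figure above refers to Cohen's kappa, a chance-corrected measure of inter-annotator agreement. A minimal, self-contained sketch of the computation follows; the example labels are invented for illustration:

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), agreement corrected for chance."""
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labeled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label marginals.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Invented example: two annotators label five items, agreeing on 4 of 5.
a = ["safe", "toxic", "safe", "safe", "toxic"]
b = ["safe", "toxic", "safe", "toxic", "toxic"]
print(round(cohen_kappa(a, b), 3))  # 0.615
```

In practice the same statistic is available off the shelf as sklearn.metrics.cohen_kappa_score.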