Grace Bailey - AI Trainer / Expert Evaluator · Outlier / Scale AI

Key Skills

Software

Scale AI

Mercor

Mindrift

Data Annotation Tech

Top Subject Matter

Humanities Domain Expertise

Creative Writing

Reasoning Domain Expertise

Top Data Types

Text

Image

Document

Top Task Types

RLHF

Object Detection

Text Generation

Question Answering

Text Summarization

Evaluation/Rating

Computer Programming/Coding

Classification

Bounding Box

Transcription

Data Collection

Fine-tuning

Prompt + Response Writing (SFT)

Entity (NER) Classification

Function Calling

Segmentation

Freelancer Overview

AI Trainer / Expert Evaluator · Outlier / Scale AI. Brings 4+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core platforms include Scale AI, Mercor, and Mindrift. Education: Master of Arts, University College London (2023) and Bachelor of Arts, University College London (2022). AI-training focus covers text data and labeling workflows including RLHF and Evaluation/Rating.

Intermediate · English · Spanish · Italian · French

Labeling Experience

AI Evaluator (Assessment Phase) · Alignerr

Text

Participating in the assessment phase for expert evaluator and fact-checking analyst roles on the Alignerr platform, focusing on rubric-based annotation and review of AI-generated content. Tasks include policy and ethics review, historical argument fidelity assessment, and logical reasoning evaluations for LLMs. Applies subject expertise in classical studies, translation, and policy analysis to complex data labeling and alignment streams.

• Progressing through assessments toward specialist roles in academic writing review and source fidelity.
• Engaged in high-precision rubric design and evaluation activities.
• Evaluates translation accuracy and identifies anachronistic reasoning errors in LLM outputs.
• Ongoing participation in adversarial and instruction-following evaluation streams.

2026 - Present

AI Evaluator · Mindrift

Mindrift · Text

Responsible for annotation, evaluation, and expert-graded assessment of AI outputs on Mindrift, with a focus on logic, reasoning, and instruction-following quality. Tasks include reviewing generated content for factual accuracy, logical consistency, and cultural nuance, particularly for humanities and classical content. Participates in preference ranking and instruction-following evaluation streams on the strength of onboarding assessment performance.

• Completed both platform onboarding assessments with successful results.
• Specializes in adversarial and instruction-following prompt evaluations for LLM outputs.
• Applies editorial and linguistic expertise for precise content annotation.
• Actively engaged in academic writing and research-focused AI labeling streams.

2026 - Present

AI Trainer · Mercor

Mercor · Text

Conducting expert evaluation and review of academic writing, research analysis, and fact-checking tasks specific to humanities, historical accuracy, and policy document streams. Qualified for higher-tier annotation and evaluation categories after passing multiple assessments and background checks on Mercor's platform. Actively participates in RLHF and rubric-guided AI training and evaluation, including logical reasoning assessments.

• Cleared five platform assessments and background checks for specialist task eligibility.
• Delivers high-precision annotation for classical, ethical, and policy-oriented data.
• Engages in ongoing expert-graded evaluation and model preference ranking tasks.
• Applies advanced subject knowledge in classical languages, source fidelity, and research analysis.

2026 - Present

AI Trainer / Expert Evaluator · Outlier / Scale AI

Scale AI · Text · RLHF

Completed over 1,200 data annotation and evaluation tasks focusing on creative writing, reasoning, instruction-following, and adversarial prompt categories for AI model alignment and RLHF workflows. Applied editorial precision and subject matter expertise in classics, rhetoric, and humanities to deliver expert-graded evaluations and align LLM outputs. Proficient in rubric-based evaluation, logical fallacy identification, and historical source fidelity tasks for multiple frontier AI training platforms.

• Consistently awarded highest performance grades across various labeling and evaluation streams.
• Delivered annotation and evaluation services using platforms such as Scale AI, Outlier, and Millennium Leaf.
• Holds special qualifications for journalism, academic writing, and domain-expert task types, including Latin and Ancient Greek content.
• Pursues ongoing professional development through five completed Anthropic training courses and advanced assessment streams.

2024 - Present

Education

University College London

Master of Arts, Classics

2022 - 2023

University College London

Bachelor of Arts, Classics

2019 - 2022

Work History

St George’s House

Rapporteur

Windsor

2024 - Present

San Clemente

Podcast Producer

London

2023 - Present