Hunter Yeoman

AI Prompt Engineer - Large Language Models

Cypress, USA
$32.00/hr · Entry Level · Data Annotation Tech

Key Skills

Software

Data Annotation Tech

Top Subject Matter

No subject matter listed

Top Data Types

Text

Top Label Types

Evaluation Rating
Red Teaming

Freelancer Overview

I am an analytically driven mathematics graduate with hands-on experience in evaluating and red-teaming large language models, focusing on prompt engineering, model evaluation, and adversarial testing. My work involves designing structured evaluation rubrics, identifying and categorizing model failure patterns, and ensuring instruction adherence and factual accuracy in AI outputs. I have developed Python-based synthetic data environments to test model reasoning and have created frameworks for research synthesis integrity, emphasizing data reliability and cross-reference validation. My background in mathematical modeling, statistical analysis, and structured analytical writing enables me to deliver high-quality, precise labeling and annotation for AI training data, especially in complex domains such as mathematical reasoning and research synthesis. I am passionate about improving AI performance through meticulous data evaluation and annotation.

Entry Level · English

Labeling Experience

Data Annotation Tech

AI Red Teaming and Prompt Testing

Data Annotation Tech · Text · Red Teaming
Conducted red teaming of AI language models by crafting realistic, user-like prompts that challenged model reasoning and exposed failure modes such as logical errors and instruction-following issues. Evaluated responses and annotated errors according to guidelines to support model improvement.

2025 - 2025
Data Annotation Tech

AI Response Evaluation and Annotation

Data Annotation Tech · Text · Evaluation Rating
Worked on large-scale AI text labeling projects involving evaluation of language model outputs, contributing to more than 300 projects. Tasks included rating responses across dimensions such as instruction following, conciseness, completeness, tone, and harmfulness; performing pairwise comparison and ranking of outputs; and annotating errors by categorizing issues such as factual inaccuracies, logical errors, and instruction-following failures. Maintained quality across diverse prompts and domains by adhering to detailed annotation guidelines, applying consistent rubric-based evaluation, and ensuring accuracy and consistency in labeling decisions.

2025 - 2025

Education

Texas A&M University

Bachelor of Arts, Mathematics

2020 - 2024

Work History

Freelance

AI Prompt Engineer and Model Evaluation Specialist

Cypress, TX
2025 - Present

Texas A&M University

Instructional Session Leader – Advanced Calculus

College Station, TX
2024 - 2024