Svetlana Pak

AI Evaluator / Red Teaming Specialist (Contract)

Kalamazoo, USA
$65.00/hr
Intermediate
Mercor

Key Skills

Software

Mercor

Top Subject Matter

AI Safety
Adversarial Testing
LLM Evaluation

Top Data Types

Text
Document

Top Task Types

Red Teaming
Classification

Freelancer Overview

AI Evaluator / Red Teaming Specialist (Contract) with 10+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Platform experience includes Mercor and Outlier. Education includes a Bachelor of Arts from Bishkek Humanitarian University and Linguistics studies at North Western Polytechnic University (degree title not provided). AI-training work focuses on text data and on labeling workflows including red teaming, evaluation, and rating.

Intermediate
English
Russian

Labeling Experience

Mercor

AI Evaluator / Red Teaming Specialist (Contract)

Mercor, Text, Red Teaming
This role involved evaluating LLM outputs from adversarial stress tests and multi-stage jailbreak attempts. I identified critical vulnerabilities in model alignment and policy refusal logic, and conducted comparative evaluations of model responses, generating preference data for RLHF pipelines. I documented and classified complex model failure modes, providing actionable insights for AI model safety enforcement.
• Analyzed and rated side-by-side model outputs for safety and alignment.
• Generated preference data to directly inform RLHF (Reinforcement Learning from Human Feedback) training processes.
• Classified failure modes such as semantic pivots and illicit technical validation cases.
• Produced forensic insights bridging user intent with safety protocols.

2026 - Present
Mercor

AI Data Annotator (Contract)

Mercor, Text, Classification
In this contract role, I performed high-volume data annotation and labeling tasks for NLP and LLM training datasets. My focus was on intent, sentiment, and relevance classification, providing consistent, actionable feedback for project guideline improvements. I helped enhance documentation and throughput, supporting the quality of training data used for NLP models.
• Conducted classification tasks for text intent and sentiment.
• Labeled large volumes of data for NLP and LLM projects.
• Provided guideline feedback to project managers to improve clarity.
• Enhanced annotation throughput and documentation quality.

2025

AI Course Contributor / Data Annotator (Contract)

Text
I contributed as an AI Course Contributor and Data Annotator by evaluating and ranking comparative LLM responses in English and Russian. This involved applying complex rubrics to assess grammar, tone, helpfulness, and persona consistency across outputs. I also flagged safety violations and edge cases involving sensitive content and historical figures.
• Reviewed and rated LLM outputs for localization, grounding, and truthfulness.
• Applied detailed evaluation rubrics to multiple language outputs.
• Identified policy violations and risk edge cases in sensitive topics.
• Ensured model consistency in persona and helpfulness metrics.

2025

Education

North Western Polytechnic University

No Degree Title Provided, Linguistics
Not specified

Bishkek Humanitarian University

Bachelor of Arts, Regional Studies (China)
Not specified

Work History

Rooster Products International

Senior Program Manager

Shenzhen
2012 - 2016

Robal Australia

Sourcing and Development Manager

Guangzhou
2007 - 2012