Aaliyah Ruiz

AI Training Specialist - Data Annotation & Evaluation

Arizona, USA
$20.00/hr, Intermediate, CVAT, Data Annotation Tech, Datasaur

Key Skills

Software

CVAT
Data Annotation Tech
Datasaur
Doccano
Encord
Labelbox
Scale AI
SuperAnnotate
Surge AI

Top Subject Matter

No subject matter listed

Top Data Types

Image
Text
Video

Top Label Types

Action Recognition
Bounding Box
Classification
Entity NER Classification
Prompt Response Writing SFT

Freelancer Overview

I am a detail-oriented AI data annotation and LLM evaluation specialist with extensive experience delivering high-quality training data for machine learning models. My background includes large-scale image, video, audio, and text annotation, as well as advanced NLP tasks such as named entity recognition, sentiment tagging, and multi-turn dialogue evaluation. I have contributed to projects involving RLHF, preference ranking, chain-of-thought analysis, and content safety assessments, consistently maintaining high accuracy and compliance with complex guidelines. Skilled in using tools like CVAT, Labelbox, SuperAnnotate, Surge AI, and Scale Loop, I excel at executing high-volume tasks efficiently while ensuring clarity, factual correctness, and reliable performance across multimodal datasets.

Intermediate English

Labeling Experience

Surge AI

AI Training & Quality Rater

Surge AI, Text, Classification, Evaluation Rating
Reviewed and evaluated 10,000+ LLM-generated outputs across text, speech-to-text transcripts, and video captions. Specific labeling tasks included content rating, transcript and caption validation, quality assessment, safety and bias checks, reasoning and chain-of-thought evaluation, and relevance judgments. Managed a high-volume workflow while adhering to strict rubric-based scoring and quality standards, delivering structured feedback to improve model performance using Surge AI, Alectio, Scale Loop, and Outlier’s evaluation interface. Ensured accuracy, consistency, and compliance across multimodal datasets.

2024
Surge AI

LLM Evaluation Contractor

Surge AI, Text, Classification, Evaluation Rating
Completed 8,000+ RLHF tasks evaluating large-scale LLM outputs across text, audio, and captioned datasets. Tasks included preference ranking, pairwise comparison, instruction-following assessment, multi-turn dialogue evaluation, and content rating. Ensured high-quality results by adhering to strict rubric-based scoring, bias and safety checks, hallucination detection, and factual accuracy verification. Managed a high-volume workflow while maintaining 99% compliance with project guidelines, contributing to improved model performance and reliability.

2022 - 2023
CVAT

AI Data Annotation Specialist

CVAT, Image, Bounding Box, Polygon
Completed 45,000+ image, video, audio, and text annotations across multimodal datasets using CVAT, Labelbox, SuperAnnotate, and Scale’s internal tools. Specific labeling tasks included audio transcription, video captioning, NER, sentiment tagging, dialogue annotation, content rating, and safety labeling. Managed a high-volume workflow with 1,500+ high-quality labels per week, adhering to strict project guidelines and quality audits, achieving 97%+ accuracy while maintaining consistency and timely delivery.

2021 - 2022

Education

New York University

Master of Science, Data Science

2022 - 2024

Arizona State University

Bachelor of Science, Information Systems

2017 - 2021

Work History

Upwork

Online Academic Tutor

Sierra Vista
2021 - 2022

ABC Architectural Services

Project Coordinator

New York
2020 - 2021