AI Training & Quality Rater
Reviewed and evaluated 10,000+ LLM-generated outputs spanning text, speech-to-text transcripts, and video captions. Labeling tasks included content rating, transcript and caption validation, quality assessment, safety and bias checks, reasoning and chain-of-thought evaluation, and relevance judgments. Managed a high-volume workflow under strict rubric-based scoring and quality standards, using Surge AI, Alectio, Scale Loop, and Outlier’s evaluation interfaces to deliver structured feedback that improved model performance. Ensured accuracy, consistency, and compliance across multimodal datasets.