Alia Abbas - High-precision multimodal data labeler for vision, text, and audio

Key Skills

Software

AWS SageMaker

CVAT

Labelbox

Roboflow

Scale AI

SuperAnnotate

Top Subject Matter

No subject matter listed

Top Data Types

Image

Text

Video

Top Task Types

Audio Recording

Bounding Box

Polygon

Question Answering

Segmentation

Freelancer Overview

I’ve contributed to training data pipelines for teams building autonomous perception systems, enterprise voice AI, and LLMs. At Scale AI, I supported computer-vision projects involving fine-grained bounding boxes and polygon segmentation for road environments, focusing on rare edge cases like occlusions, adverse weather, and unusual object classes. I also worked on Spanish–English text datasets, performing entity extraction, sentiment labeling, and QA checks to ensure linguistic consistency across dialects. Separately, I assisted a Dialpad-style voice AI team with audio transcription and speaker-turn annotation, improving diarization accuracy for noisy call-center recordings. More recently, I’ve provided LLM evaluation, rubric scoring, preference ranking, and safety assessments for instruction-following and multilingual generation tasks. These projects taught me to maintain high precision at scale, adapt quickly to new guidelines, and give targeted feedback that genuinely improves model performance.

Expert

English

Spanish

Labeling Experience

Spanish–English Multilingual Text Annotation for LLM Training

Labelbox

Text

Entity NER Classification

Classification

Worked on a bilingual dataset used to fine-tune a multilingual language model. Labeled Spanish and English text snippets for named entities, tone, intent, and topical classification. Later transitioned to LLM evaluation tasks, including rubric scoring, preference ranking, and harmful-content detection, helping calibrate early instruction-following behaviors. Flagged inconsistencies in dialectal Spanish examples (MX vs. ES), which informed dataset cleanup.

2025

Voice AI Training: Call-Center Audio Transcription & Diarization Support

SuperAnnotate

Audio

Emotion Recognition

Translation Localization

Assisted a voice-AI team by transcribing noisy customer-support calls, tagging sentiment, and labeling speaker turns to improve diarization accuracy. Specialized in challenging acoustic conditions such as overlapping speakers, low-fidelity microphones, and accented speech. Helped refine guidelines for distinguishing cross-talk from rapid back-and-forth exchanges, reducing rework rates across the team.

2023 - 2024

Autonomous Vehicle Perception Dataset — Urban Object Detection

Scale AI

Image

Bounding Box

Polygon

Annotated and QA-checked tens of thousands of urban driving scenes for an autonomy perception team via Scale AI. Labeled vehicles, pedestrians, cyclists, traffic signs, lane boundaries, and rare edge-case objects (construction machinery, oddly shaped trailers, occluded pedestrians). Performed segmentation for drivable area mapping and temporal tracking for motion prediction. Provided guideline feedback that led to clarifying rules for nighttime glare and heavy-shadow scenarios.

2022 - 2023

Education

Georgetown University

Bachelor of Science, Physics

Not specified

Work History

N/A

Contract Engineer

San Francisco

2023 - Present

Dialpad

Data Operations Assistant

Remote

2020 - 2022