Micheal Samaniego - AI Research Assistant – Voice & Data Annotation

Key Skills

Software

Labelbox

SuperAnnotate

Prodigy

CVAT

Top Subject Matter

Voice recording

Multimodal data annotation

Computer vision

Top Data Types

Audio

Image

Text

Document

Top Task Types

Audio Recording

Classification

Freelancer Overview

AI Research Assistant specializing in voice and data annotation, with 8+ years of professional experience spanning research and quality-focused execution. Core tools include Labelbox, SuperAnnotate, and Prodigy. Holds a Doctor of Philosophy from Columbia University (2021) and a Master of Arts from the University of Houston (2014). AI-training work centers on audio data, with labeling workflows including audio recording and classification.

Expert: English, Swahili

Labeling Experience

AI Research Assistant – Voice & Data Annotation

Labelbox, Audio, Audio Recording

Produced over 500 high-quality audio clips describing images for use in multimodal AI training datasets. Ensured recordings were natural, contextually accurate, and semantically precise, adhering to annotation guidelines and maintaining a consistent voice tone. Developed annotation style documents, partnered with QA teams, contributed to pilot multimodal captioning projects, and delivered datasets in standardized formats for supervised ML training.

• Reduced QA corrections by 35% through process improvement.
• Collaborated with engineers to ensure dataset compatibility with pipelines.
• Created onboarding materials for new annotators to ensure compliance.
• Recognized for attention to detail, reliability, and meeting weekly quotas.

2023 - Present

Graduate Research Associate – Computational Linguistics

Labelbox, Audio, Classification

Conducted advanced research on semantics, discourse, and speech clarity in human–AI systems, designing and executing annotation experiments to compare AI-generated voice outputs with human-recorded speech. Built custom datasets of annotated voice recordings paired with image descriptions for exploratory AI projects in multimodal learning. Collaborated on NLP preprocessing pipelines, trained students, and refined data quality metrics.

• Published two peer-reviewed journal articles on discourse clarity in AI-generated outputs.
• Presented findings at high-profile research venues.
• Helped create benchmarks for dataset validation and reproducibility.
• Supported collaborative research projects through annotation training.

2021 - 2022

Education

Columbia University

Doctor of Philosophy, Linguistics and Computational Communication

2017 - 2021

University of Houston

Master of Arts, English – Language and Digital Communication

2012 - 2014

Work History

Columbia University

Graduate Research Associate

New York

2021 - 2022

New York University

Graduate Teaching Fellow

New York

2015 - 2017