Alice Mutheu - AI Training Specialist - Data Annotation

Key Skills

Software

Labelbox

Supervisely

Doccano

Top Subject Matter

No subject matter listed

Top Data Types

Image

Text

Top Label Types

Object Detection

Segmentation

Classification

Bounding Box

Point Key Point

Tracking

Freelancer Overview

I am an experienced AI Training Specialist with over 3 years of hands-on expertise in data labeling and annotation for image, video, audio, and text datasets. My work has supported diverse AI domains, including computer vision (object detection, facial recognition, motion tracking) and natural language processing (speech-to-text, sentiment analysis, chatbot training). I am skilled in using leading annotation tools such as Labelbox, CVAT, Supervisely, V7 Darwin, Scale AI, Amazon SageMaker Ground Truth, Google Cloud Data Labeling, Audacity, ELAN, Doccano, and Prodigy. I excel at ensuring high-quality, accurate training data through rigorous quality assurance and dataset validation, and I have a strong track record of collaborating with cross-functional teams to refine labeling guidelines and improve model performance. My commitment to precision, efficiency, and data confidentiality helps drive continuous improvement in AI workflows.

Expert, English, Japanese, Spanish, French, Portuguese

Labeling Experience

AI Training Specialist

Labelbox, Image, Object Detection

I labeled and annotated image and video datasets for various computer vision applications, including object detection, facial recognition, and motion tracking. I performed audio transcription and speech annotation for natural language processing and voice-recognition systems. I conducted thorough dataset quality checks and collaborated with machine learning engineers to continually refine our labeling approach.
• Managed high-volume annotation projects using tools such as Labelbox, CVAT, and Supervisely
• Achieved consistent annotation accuracy rates above 98%
• Delivered labeling feedback and assisted in model iteration
• Supported team knowledge sharing and documentation efforts

2023

Multimodal AI Dataset Annotation for Computer Vision & Speech Recognition Systems

Labelbox, Image, Bounding Box, Point Key Point

Led and contributed to large-scale multimodal data annotation projects supporting computer vision, NLP, and speech recognition models. For computer vision datasets, performed bounding box, polygon, semantic segmentation, and cuboid annotation, as well as object detection and multi-object tracking, across 50,000+ images and 5,000+ video frames. Followed strict annotation guidelines to ensure consistency and achieved over 98% accuracy in QA audits. For NLP and LLM projects, conducted Named Entity Recognition (NER), text classification, sentiment analysis, and prompt-response evaluation (SFT), and supported fine-tuning. Annotated 100,000+ text entries for chatbot and generative AI optimization. For audio datasets, completed transcription, timestamping, speaker diarization, and emotion recognition tasks for voice assistant training. Processed 2,000+ hours of audio data while adhering to confidentiality and data security standards. Followed structured QA workflows including peer review, d

2022 - 2024

Data Annotation Specialist

Supervisely, Image, Segmentation

I annotated large-scale datasets including images, videos, and text corpora with bounding boxes, polygons, and semantic segmentation labels. I managed dataset organization, version control, and supported AI development through dataset validation. I contributed to AI testing by validating model predictions and ensuring data quality.
• Utilized tools such as Supervisely, CVAT, and Scale AI for complex annotation tasks
• Maintained structured data labeling workflows and documentation
• Participated in cross-functional data preparation meetings
• Improved project efficiency through meticulous dataset management

2022 - 2023

AI Data Labeling Assistant

Doccano, Text, Classification

I labeled text data for sentiment analysis and chatbot training projects, ensuring coherent, high-quality training data. I transcribed and timestamped audio recordings to build structured speech datasets. I maintained strict data confidentiality and adhered to established annotation standards.
• Used Doccano and Prodigy for text and NLP annotation workflows
• Supported the preparation of training materials and workflow guides
• Participated in ongoing quality assurance processes
• Collaborated with peers to review and enhance workflow guidelines

2021 - 2022

Education

University of Nairobi

Bachelor of Science, Information Technology

2017 - 2021

Precious Blood Secondary School

Kenya Certificate of Secondary Education, General Secondary Education

2013 - 2016

Work History

Digital Content Solutions Ltd

AI Content Analyst & Quality Assurance Specialist

Massachusetts

2022 - 2025