Colinz Yegon

Expert in AI computer vision data labeling for self-driving cars

Nairobi, Kenya
$8.00/hr, Expert, CloudFactory, CrowdSource, CVAT

Key Skills

Software

CloudFactory
CrowdSource
CVAT
Data Annotation Tech
Img Lab
Labelbox
LabelImg
OpenCV AI Kit (OAK)
Sama
Internal/Proprietary Tooling
Appen

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Image
Text

Top Task Types

Bounding Box
Classification
Computer Programming Coding
Segmentation
Translation Localization

Freelancer Overview

I am a skilled data annotation specialist with over two years of experience in speech transcription, linguistic data labeling, and AI training across English and Swahili datasets. My work has involved transcribing and cleaning audio data for use in training speech recognition models, with a strong emphasis on phonetic accuracy, grammatical correction, and adherence to strict transcription guidelines. I have collaborated on projects through platforms like TranscribeMe, Rev, and Sama, focusing on non-native English speech and multilingual corpora. What sets me apart is my bilingual proficiency (Swahili-English), advanced linguistic training, and hands-on experience with large-scale AI data projects. I possess a keen ear for diverse accents, strong attention to detail, and a deep understanding of how high-quality labeled data directly impacts model performance. Whether working independently or in a team setting, I consistently meet quality benchmarks and tight deadlines.

Expert, Swahili, French, German, English

Labeling Experience

Appen

English & Swahili Audio Transcription and Annotation – Appen & TranscribeMe

Appen, Audio, Classification, Question Answering
Worked on multiple projects transcribing and annotating English and Swahili speech datasets for training AI speech recognition and NLP systems. Tasks involved listening to short voice recordings, accurately transcribing content, removing disfluencies, correcting grammar, and classifying speaker accents and background noise levels. I followed detailed transcription and annotation guidelines to ensure consistency across large datasets. Quality control combined peer review and automated QA systems, and I maintained an accuracy rate of over 98%. Some tasks included rating ASR model outputs for fluency and comprehension. I processed over 1,000 audio clips per week, totaling over 30,000 clips throughout the engagement.

2022 - 2022

Education

Kenyatta University

Bachelor of Arts in Linguistics, Linguistics and Language Studies
2015 - 2019

Work History

Sama AI

Data Annotation Assistant

Nairobi
2020 - Present