Sam Biwott

Research Assistant (Language Project)

Nairobi, Kenya
$6.00/hr
Expert

Key Skills

Software

No software listed

Top Subject Matter

Swahili speech/language data
Speech dataset development for machine learning
Digital content localization and translation for Swahili

Top Data Types

Audio
Text

Top Task Types

Transcription
Segmentation

Freelancer Overview

Research Assistant (Language Project) with 6+ years of professional experience in research and quality-focused execution across complex professional workflows. Core strengths include internal and proprietary tooling. Holds a Bachelor of Arts from Egerton University. AI-training focus covers Audio and Text data, with labeling workflows including Transcription, Segmentation, and Translation.

English (Expert)

Labeling Experience

Translation & Localization Project

Text
Labeled and adapted English digital content for accurate Swahili localization on digital platforms. Ensured that translations maintained high contextual, cultural, and linguistic accuracy suitable for target audiences. Conducted consistency checks and refined localizations for machine learning and product optimization.
• Translated and localized content for digital AI-driven projects
• Labeled linguistic attributes and contextual features
• Performed accuracy checks to ensure the intended meaning was preserved
• Collaborated on contextual optimization in ML/NLP settings

2023 - Present

Swahili Speech Dataset Development (Project)

Audio, Segmentation
Developed structured datasets for Swahili automatic speech recognition through audio file segmentation and labeling. Enhanced machine learning models by refining data annotation techniques and improving audio dataset quality. Collaborated with linguists to optimize dataset representativeness across dialects and scenarios.
• Segmented Swahili audio into usable training utterances
• Labeled speech segments for speaker, region, and content
• Focused on high-fidelity segmentation and metadata assignment
• Supported iterative data quality assurance processes

2023 - Present

Research Assistant (Language Project)

Audio, Transcription
Contributed to collecting, transcribing, and annotating Swahili speech data for linguistic and AI research. Processed regional speech samples and prepared datasets for use in machine learning models. Conducted data cleaning to ensure data integrity and optimized dataset quality throughout the project.
• Collected and transcribed a wide variety of Swahili dialects
• Labeled and segmented audio for speech recognition
• Ensured transcription accuracy exceeding 98%
• Supported dataset preparation for academic and AI uses

2023 - Present

Education

Egerton University

Bachelor of Arts, Education (Kiswahili)


Work History

University of Nairobi

Research Assistant

Nairobi
2023 - Present

Makini School

Kiswahili Tutor

Nairobi
2021 - 2022