Transcription & Text Normalization
Worked on a multilingual audio annotation project that used English and Swahili datasets to train speech recognition and language models. Transcribed audio recordings into accurate text while preserving context, speaker intent, and linguistic nuance. Handled diverse audio inputs, including conversations, interviews, and informal speech, with varying accents, background noise, and speaking speeds. Normalized transcripts by standardizing grammar, punctuation, and formatting so the text was consistent and usable for AI training. Applied speech-to-text labeling techniques to align audio with its corresponding transcripts, supporting model accuracy in real-world conditions. Followed strict transcription and annotation guidelines with close attention to detail, and ran quality checks to deliver clean, reliable datasets for model training.
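To illustrate the kind of normalization pass described above, here is a minimal Python sketch. The function name `normalize_transcript` and the specific rules (Unicode NFC, straight quotes, collapsed whitespace, sentence-initial capital, terminal punctuation) are assumptions chosen for illustration; they are not the project's actual guidelines, which are not reproduced here.

```python
import re
import unicodedata

# Hypothetical rule set for illustration only: map curly quotes to straight quotes.
_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def normalize_transcript(text: str) -> str:
    """Apply a few simple, assumed normalization rules to one transcript line."""
    # Use one canonical Unicode form so visually identical strings compare equal.
    text = unicodedata.normalize("NFC", text)
    # Standardize quote characters.
    text = text.translate(_QUOTES)
    # Collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove stray spaces before punctuation marks.
    text = re.sub(r"\s+([,.?!;:])", r"\1", text)
    # Capitalize the first character if it is lowercase.
    if text and text[0].islower():
        text = text[0].upper() + text[1:]
    # Ensure the line ends with terminal punctuation.
    if text and text[-1] not in ".?!":
        text += "."
    return text


if __name__ == "__main__":
    print(normalize_transcript("  habari   yako ,  rafiki "))
    print(normalize_transcript("i\u2019m fine , thanks !"))
```

The rules are applied identically to English and Swahili lines here; a real pipeline would likely add language-specific steps (for example, number or abbreviation handling) defined by the annotation guidelines.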