Audio Transcription & Emotion Annotation for Speech Recognition Models
Participated in a large-scale audio annotation project to improve automatic speech recognition (ASR) and conversational AI systems, transcribing and labeling over 15,000 hours of multilingual audio, including customer service calls, virtual assistant interactions, and spontaneous conversational speech.

- Produced verbatim transcriptions with speaker diarization, timestamp alignment, noise identification, and sound event classification.
- Annotated emotional tone categories (neutral, positive, negative, frustrated, excited) to support emotion recognition model training.
- Handled background noise, overlapping speech, regional accents, and low-clarity recordings to keep the dataset consistent.
- Followed strict quality assurance processes, including double-pass review, random batch audits, word error rate (WER) checks, and compliance with standardized transcription guidelines.
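The WER checks mentioned above compare a reviewed reference transcript against an annotator's pass. A minimal sketch of such a check in standard Python (the `wer` helper here is illustrative, not a tool used on the project; real QA pipelines typically also normalize casing and punctuation first):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One dropped word out of six reference words -> WER of 1/6
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

A batch audit would run this over sampled segments and flag any whose WER exceeds the project's threshold for a second review pass.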