Multilingual Speech Recognition & Acoustic Emotion Mapping
I worked on a large-scale data annotation project designed to improve the Natural Language Understanding (NLU) of a global virtual assistant. My primary responsibility was the verbatim transcription of varied audio, including recordings from noisy environments, recordings with cross-talk, and speakers with heavy regional accents. I also performed phonetic labeling and tagged acoustic events (e.g., background noise, non-speech vocalizations) to help the model distinguish user commands from environmental interference.

In addition, I contributed to a specialized emotion recognition layer, categorizing each speaker's sentiment (e.g., frustrated, satisfied, neutral) and urgency level. This data was used to refine the model's ability to adjust its tone of voice in response to user mood. A sketch of what a segment-level annotation record of this kind might look like appears below.

By adhering to strict orthographic guidelines and maintaining high throughput, I helped decrease the Word Error Rate (WER) of the client's localized speech models.
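To make the annotation work concrete, here is a minimal sketch in Python of a segment-level record combining transcription, acoustic event tags, and the emotion layer. The schema, field names, and the urgency scale are my own illustrative assumptions, not the client's actual tooling; only the label categories (background noise, cross-talk, non-speech vocalizations; frustrated, satisfied, neutral) come from the work described above.

```python
from dataclasses import dataclass, field
from enum import Enum


class AcousticEvent(Enum):
    BACKGROUND_NOISE = "background_noise"
    CROSS_TALK = "cross_talk"
    NON_SPEECH_VOCALIZATION = "non_speech_vocalization"


class Sentiment(Enum):
    NEUTRAL = "neutral"
    FRUSTRATED = "frustrated"
    SATISFIED = "satisfied"


@dataclass
class AnnotatedSegment:
    """One annotated slice of audio (illustrative schema, not the client's)."""
    start_s: float                                      # segment start, in seconds
    end_s: float                                        # segment end, in seconds
    transcript: str                                     # verbatim text per orthographic guidelines
    phonemes: list[str] = field(default_factory=list)   # phonetic labels
    events: list[AcousticEvent] = field(default_factory=list)
    sentiment: Sentiment = Sentiment.NEUTRAL
    urgency: int = 0                                    # hypothetical scale: 0 (low) to 2 (high)


# Example: a command spoken over background noise by a frustrated user
segment = AnnotatedSegment(
    start_s=3.2,
    end_s=5.1,
    transcript="turn off the alarm",
    events=[AcousticEvent.BACKGROUND_NOISE],
    sentiment=Sentiment.FRUSTRATED,
    urgency=2,
)
```

Keeping the acoustic event tags separate from the transcript itself is what lets a downstream model learn to ignore environmental interference rather than transcribe it.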
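For context on the closing metric: WER counts word-level substitutions (S), deletions (D), and insertions (I) between a reference transcript and the model's hypothesis, normalized by the reference length N, i.e. WER = (S + D + I) / N. Below is a minimal sketch of the standard edit-distance computation, not the client's evaluation harness.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (S + D + I) / N, computed via word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:  # avoid dividing by zero on an empty reference
        return float(bool(hyp))
    # d[i][j] = minimum edits turning the first i reference words
    # into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / len(ref)


if __name__ == "__main__":
    # One deletion ("the") against a four-word reference -> 0.25
    print(word_error_rate("turn on the light", "turn on light"))
```

Because N is the reference length, a single dropped word in a short utterance moves WER substantially, which is why accurate verbatim transcription of short commands matters so much for this metric.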