French Text & Audio Data Labeling for LLM Training
Participated in multiple data annotation projects for LLM development, focused on French-language tasks. Tasks included:
- Annotating user intents, entities, and emotional tones in French conversations.
- Evaluating AI-generated responses for coherence, fluency, and alignment with prompts.
- Writing high-quality prompt-response pairs in French for supervised fine-tuning (SFT).
- Transcribing French audio data and applying phonemic labels for speech recognition training.
- Reviewing other contributors’ work for quality assurance to ensure consistency and high labeling accuracy.
The projects ranged from small-batch datasets (under 1,000 items) to large-scale corpora exceeding 50,000 samples. Maintained a quality rating above 95% across all assigned tasks and adapted quickly to evolving guidelines and tool updates.