Somia Mazzouzi

LLM Evaluation in English, Arabic & French. Also, Image & Audio Annotator

Tlemcen, Algeria
$14.00/hr, Intermediate, Appen, Clickworker, Labelbox

Key Skills

Software

Appen
Clickworker
Labelbox
Lionbridge
OneForma
Internal/Proprietary Tooling
Other
CrowdSource

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Image
Text

Top Task Types

Audio Recording
Bounding Box
Classification
Entity NER Classification
Evaluation Rating

Freelancer Overview

I am an experienced Data Annotation Specialist with a strong background in linguistic data labeling and AI training across multilingual datasets. Over the past few years, I have worked on high-volume text, audio, and image annotation projects for companies such as Samsung Technologies, XPeng Motors, Lionbridge, and Romoter. My tasks included annotating over 20,000 French text and audio samples on Samsung’s internal CAS2 platform, as well as labeling English and French queries for in-car control systems according to NLU guidelines. I also annotated Arabic texts for Lionbridge, which broadened the multilingual scope of my work, and performed image annotation for Romoter, maintaining high accuracy and consistency in object detection tasks. Additionally, I annotated 31,172 queries for XPeng Motors, contributing to the development of their in-car systems. With a Master’s degree in Linguistics, I bring a deep understanding of language structure and meaning, which sharpens my precision in labeling and in identifying edge cases. I have hands-on experience with tools such as Excel, CAS2, and Labelbox, and I consistently follow complex guidelines while meeting tight deadlines. My ability to work across multiple languages and data types makes me a strong asset for AI training and model refinement projects.

Intermediate, Arabic, French, English

Labeling Experience

CrowdSource

Multilingual Text and Audio Annotation for Voice Assistant - Samsung Technologies

Crowdsource, Text, Entity NER Classification, Classification
Contributed to a large-scale AI training project for Samsung Technologies by annotating and labeling over 20,000 French texts and audio samples on their internal crowdsourcing platform (CAS2). Tasks involved tagging intents, semantic roles, and dialogue actions to support natural language understanding (NLU) for Samsung’s voice assistant systems. Followed complex annotation guidelines and maintained a high standard of accuracy and consistency. Regularly reviewed edge cases and provided feedback to improve annotation quality and system training outcomes.

2024 - 2024

Multilingual Data Annotation for NLU and Autonomous Driving Systems

Other, Text, Entity NER Classification
Worked on a multilingual NLU annotation project for XPeng Motors focused on in-car voice control systems. Annotated and labeled about 31,172 French and English user queries across approximately 8 datasets, following detailed NLU guidelines. Tasks involved identifying and tagging user intents and entities related to car control functions (e.g., air conditioning, navigation, media). Maintained high annotation accuracy and consistency while flagging ambiguous cases to support guideline refinement. This project contributed to improving the performance of voice-enabled features in autonomous vehicle systems.

2024 - 2024

Education

University Abu Bakr Belkaid

Master's Degree, Linguistics

2020 - 2022

University Abu Bakr Belkaid

Bachelor's Degree, English Language & Literature

2017 - 2020

Work History

TransPro

Translator & Language Specialist

N/A
2021 - Present