Ghita Dahreddine

LLM Evaluation & Prompt Engineering Specialist – French & English

Paris, France
$25.00/hr | Intermediate | Appen | Data Annotation Tech | Labelbox

Key Skills

Software

Appen
Data Annotation Tech
Labelbox
OneForma
Remotasks
Scale AI
Telus
Don't disclose

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Image
Text

Top Task Types

Data Collection
Evaluation & Rating
Prompt-Response Writing (SFT)
Question Answering
Translation & Localization

Freelancer Overview

As a bilingual LLM Evaluation Specialist and Prompt Engineer (French-English), I have contributed to cutting-edge AI training pipelines through high-quality data labeling, prompt crafting, and model response evaluation. I have worked with leading AI platforms on tasks such as ranking outputs, identifying harmful content, testing prompt robustness, and fine-tuning model performance using human-in-the-loop feedback. With a strong foundation in language, logic, and model behavior analysis, I bring both linguistic accuracy and strategic thinking to every project. My ability to operate in both French and English allows me to support multilingual AI development, ensuring models are culturally and contextually accurate across languages.

Intermediate | Arabic | French | English

Labeling Experience

Audio Evaluation & Disfluency Labeling (EN/FR)

Don't Disclose | Audio | Classification | Emotion Recognition
Evaluated short audio clips and conversational speech samples in English and French, focusing on identifying disfluencies, filler words, mispronunciations, and tone shifts. Tasks included: detecting and labeling repetitions, hesitations, false starts, and other natural disfluencies; classifying speaker tone, clarity, and intent (e.g. neutral, emotional, uncertain); reviewing spoken outputs for alignment with transcription and pronunciation norms; flagging unsafe or problematic speech according to ethical and safety standards; and providing structured feedback to improve speech data quality for language models and ASR systems. The work required linguistic attention to detail, familiarity with phonetic patterns, and the ability to follow strict annotation guidelines.

2025 - 2025
Scale AI

LLM Evaluation and Prompt Engineering (EN/FR)

Scale AI | Text | Question Answering | Text Generation
Worked on a range of advanced data labeling and AI training tasks to support the development and alignment of large language models (LLMs). Projects spanned multiple domains including Mathematics, Education, Travel & Transportation, Social Sciences, and Law, with a strong emphasis on safety, multilingual performance, and response quality. Key responsibilities included: designing both harmful and benign prompts across defined safety categories to test model robustness and identify edge-case failures; conducting pairwise response comparisons to determine which model output was more helpful, accurate, or safe (aligned with RLHF workflows); evaluating open and closed question answering tasks across technical and general knowledge domains; completing text classification assignments (e.g. labeling tone, topic, risk level, or intent); performing text summarization, including condensing long responses or user inputs into clear, concise outputs; writing and refining prompt-response pairs for

2025 - 2025

Education

Grenoble École de Management

Master of Science, Corporate Finance and Investment Banking
2020 - 2023
Sorbonne University

Bachelor of Science, Mathematics
2017 - 2020

Work History

ITH CENTER

Financial Analyst

PARIS
2023 - Present
SOCIETE GENERALE

Credit Risk Analyst

PARIS
2021 - 2023