Kazem Hamza

AI Conversation Evaluator - Conversational AI

Al-Khankah, Egypt
$20.00/hr · Entry Level · Labelbox · Don't disclose

Key Skills

Software

Labelbox
Don't disclose

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Text

Top Label Types

Classification
Transcription
Relationship
Question Answering
Emotion Recognition
Evaluation Rating
Audio Recording
Text Generation
RLHF
Prompt/Response Writing (SFT)

Freelancer Overview

I have hands-on experience in AI training data creation, labeling, and human evaluation across multiple real-world projects involving speech, audio, and large language models.

At Alignerr, I worked on Project Vera – VAD Audio Annotation v2, producing millisecond-level (1 ms) word-aligned transcriptions, labeling speech types (standard speech, acknowledgments, interruptions), and annotating background noise events. I also contributed to Project Human Evals v2.3, where I conducted live, multi-turn audio conversations with AI models in noisy environments to evaluate speech recognition, responsiveness, task completion, context awareness, and emotional empathy in both customer support and companionship scenarios.

In addition, through Cypher RLHF at Outlier, I supported LLM improvement via human feedback by writing prompts and performing pairwise evaluations of AI-generated responses. I assessed outputs for truthfulness, clarity, grammatical quality, text structure, relevance, and alignment with user intent, providing consistent, high-quality annotations used for model fine-tuning.

Together, these projects demonstrate strong skills in attention to detail, critical evaluation, conversational AI understanding, RLHF workflows, and human-in-the-loop training, with a proven ability to generate high-quality data that improves AI accuracy, reliability, and real-world performance.

Entry Level · English · Arabic

Labeling Experience

AI Data Annotator & Evaluator

Don't disclose · Text · Text Generation · RLHF
Wrote prompts and performed pairwise evaluation of AI-generated responses as part of an RLHF workflow. Compared outputs based on truthfulness, clarity, grammar, text structure, relevance, and overall quality, providing human feedback used to improve large language model accuracy, readability, and alignment with user intent.

2025 - 2025
Labelbox

AI Data Annotator & Evaluator

Labelbox · Audio · Relationship · Question Answering
Conducted live multi-turn audio conversations with AI models in noisy environments to evaluate speech responsiveness, task completion, context awareness, and emotional empathy. Assessed model performance in customer support scenarios (e.g., TV and internet troubleshooting) and companionship interactions to help improve real-world reliability and human-like behavior.

2025 - 2025
Labelbox

LLM / Conversational AI Evaluator

Labelbox · Audio · Classification · Transcription
Performed millisecond-level audio transcription and annotation of user–AI conversations. Labeled each spoken word by speech type (standard speech, acknowledgments, interruptions) and annotated background noise events with precise timestamps to support improvements in speech recognition, turn-taking, and noise robustness for conversational AI models.

2025 - 2025
Alignerr

Education

Helwan University

Bachelor of Science, Electrical Engineering

2019 - 2024

Work History

Contractor

Electrical Site Engineer Trainee

Badr City, Egypt
2023 - 2023