Y G

AI Training/Data Labeling for Multilingual Speech Translation Project

Cairo, Egypt
$30.00/hr | Intermediate | Roboflow | Data Annotation Tech | Img Lab

Key Skills

Software

Roboflow
Data Annotation Tech
Img Lab
Label Studio
Mercor
OpenCV AI Kit (OAK)

Top Subject Matter

Speech Recognition and Translation
Domain-Specific NLP Question Answering
Text Summarization/NLP

Top Data Types

Audio
Text
Image

Top Task Types

Question Answering
Text Summarization
Classification
Segmentation
RLHF
Object Detection
Bounding Box
Data Collection
Transcription

Freelancer Overview

AI trainer and data labeler focused on a multilingual speech translation project, with 6+ years of professional experience across complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Holds a Bachelor of Science from London South Bank University / British University in Egypt (2024). AI-training work covers audio, text, and image data, with labeling workflows including translation, localization, and question answering.

Intermediate | Arabic | English

Labeling Experience

AI Research Assessment – Object Segmentation & Annotation Pipeline

Image | Segmentation
Performed image segmentation annotation for a computer vision task by extracting precise object masks from a provided dataset. Leveraged the Segment Anything Model (SAM) to generate high-quality segmentation masks and refine them for a specific target object across images. Extracted the polygon coordinates of each segmented object and structured the annotations into YAML files for downstream training pipelines. Ensured annotation consistency, validated mask accuracy, and prepared the dataset for integration with modern segmentation and object detection frameworks.

2025 - 2025
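
To illustrate the polygon-to-YAML step described in this project, here is a minimal sketch. It assumes the polygon vertices have already been extracted from the SAM masks as pixel coordinates. The YAML schema (`image`, `objects`, `label`, `polygon`) and both function names are hypothetical and are not the format actually used in the project.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def normalize_polygon(points: List[Point], width: int, height: int) -> List[Point]:
    """Scale pixel coordinates into the [0, 1] range, rounded to 4 decimals."""
    return [(round(x / width, 4), round(y / height, 4)) for x, y in points]


def annotation_to_yaml(image: str, width: int, height: int,
                       label: str, polygons: List[List[Point]]) -> str:
    """Serialize one image's polygons as YAML text (hypothetical schema)."""
    lines = [
        f"image: {image}",
        f"width: {width}",
        f"height: {height}",
        "objects:",
    ]
    for poly in polygons:
        lines.append(f"  - label: {label}")
        lines.append("    polygon:")
        for x, y in normalize_polygon(poly, width, height):
            lines.append(f"      - [{x}, {y}]")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # One triangular polygon on a 200x100 image (illustrative values only).
    print(annotation_to_yaml("img_001.jpg", 200, 100, "target",
                             [[(0, 0), (100, 0), (100, 50)]]))
```

Normalizing coordinates keeps the annotations valid if the images are resized later. In practice a YAML library would usually handle serialization and quoting.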

Computer Vision Challenge

Image | Bounding Box
Annotated image datasets for an object detection task using bounding box labeling. Used Roboflow to efficiently label objects across a large set of images while following strict annotation guidelines. Ensured accurate bounding boxes, handled edge cases such as occlusions and overlapping objects, and performed quality checks to maintain dataset consistency. The labeled dataset was prepared for training computer vision models using common formats such as YOLO.

2025 - 2025
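
For readers unfamiliar with the YOLO label format mentioned above, the sketch below converts a pixel-space bounding box into a YOLO text line: class id followed by the normalized box center, width, and height. The function name and the clamping of out-of-frame boxes are assumptions made for illustration. Roboflow performs this conversion on export, so this is not code from the project.

```python
from typing import Tuple


def to_yolo_line(class_id: int, box: Tuple[float, float, float, float],
                 img_w: int, img_h: int) -> str:
    """Convert (x_min, y_min, x_max, y_max) in pixels to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    # Clamp to the image so partially out-of-frame boxes stay valid.
    x_min, x_max = max(0, min(x_min, img_w)), max(0, min(x_max, img_w))
    y_min, y_max = max(0, min(y_min, img_h)), max(0, min(y_max, img_h))
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("box has no area inside the image")
    cx = (x_min + x_max) / 2 / img_w   # normalized center x
    cy = (y_min + y_max) / 2 / img_h   # normalized center y
    w = (x_max - x_min) / img_w        # normalized width
    h = (y_max - y_min) / img_h        # normalized height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    print(to_yolo_line(0, (50, 20, 150, 80), 200, 100))
```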

AI Data Labeling for Text Summarization Project

Text | Text Summarization
I developed an NLP model for text summarization in Arabic and English by fine-tuning the T5 transformer, which involved constructing labeled datasets and validating summary outputs. I performed text annotation, summary alignment, and model output evaluation to improve summarization accuracy. The work contributed directly to supervised learning pipelines in NLP.
• Performed manual summarization and created a gold-standard dataset
• Aligned input texts with reference summaries for supervised fine-tuning
• Evaluated summarization precision and consistency
• Assisted in metadata labeling for quality assessment

2024 - 2024

Multilingual Audio Transcription Dataset for Speech Recognition

Audio | Transcription
Collected and prepared a multilingual audio dataset from YouTube, Facebook, and X to support my graduation project in speech recognition. Audio samples were processed and automatically transcribed using Whisper. The dataset included recordings in English, German, Spanish, and Arabic (both Modern Standard Arabic and Egyptian dialect). Generated and validated transcription outputs to ensure alignment between speech and text, creating structured labeled data suitable for training and evaluating speech-to-text and NLP models.

2023 - 2024
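
One common way to validate automatic transcripts against a hand-checked reference is word error rate (WER). The profile does not say which metric was used for this dataset, so the sketch below is an assumption that only shows what such a check can look like. It splits on whitespace and lowercases, which is a simplification, especially for Arabic text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Clips whose WER exceeds a chosen threshold can then be flagged for manual review.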

AI Data Labeling for Quran Question Answering AI Model

Text | Question Answering
I trained and fine-tuned transformer-based models (RoBERTa, AraBERT) for question answering on Quranic content, which involved annotating question-answer pairs and validating model outputs. My tasks included curating datasets, labeling relevant answers, and implementing evaluation routines to ensure dataset quality. This project helped improve domain-specific NLP models by providing accurately labeled data.
• Annotated question-answer pairs for AI fine-tuning in a religious context
• Validated and reviewed model-predicted answers
• Ensured dataset integrity with systematic accuracy checks
• Provided domain expertise to enhance labeling quality

2023 - 2024
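
The evaluation routines mentioned in this project are not described in detail. As an assumption, the sketch below shows two metrics widely used for extractive question answering: exact match and token-level F1 between a predicted answer and a gold answer. The normalization here (lowercasing and whitespace splitting) is deliberately simple. Arabic text would typically need extra steps such as diacritic removal, which are omitted.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (simplified normalization)."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold: str) -> bool:
    """True when the normalized strings are identical."""
    return normalize(prediction) == normalize(gold)


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("Answer Span", "answer  span"))
    print(token_f1("a b c", "a b"))
```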

Education

London South Bank University / British University in Egypt

Bachelor of Science, Informatics and Computer Science, Artificial Intelligence Specialization

2020 - 2024

Work History

Freelancing

Freelance Web / AI Developer (SaaS)

Cairo
2024 - 2025

National Bank of Egypt

Network Engineer Intern

Cairo
2022 - 2022