Y G - AI Training/Data Labeling for Multilingual Speech Translation Project

Key Skills

Software

Roboflow

Data Annotation Tech

Img Lab

Label Studio

Mercor

OpenCV AI Kit (OAK)

Top Subject Matter

Speech Recognition and Translation

Domain-Specific NLP Question Answering

Text Summarization/NLP

Top Data Types

Audio

Text

Image

Top Task Types

Question Answering

Text Summarization

Classification

Segmentation

RLHF

Object Detection

Bounding Box

Data Collection

Transcription

Freelancer Overview

AI training and data labeling freelancer focused on multilingual speech translation. Brings 6+ years of professional experience across complex professional workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Holds a Bachelor of Science from London South Bank University / British University in Egypt (2024). AI-training work covers audio, text, and image data, with labeling workflows including translation, localization, and question answering.

Intermediate / Arabic / English

Labeling Experience

AI Research Assessment – Object Segmentation & Annotation Pipeline

Image / Segmentation

Performed image segmentation annotation for a computer vision task by extracting precise object masks from a provided dataset. Leveraged the Segment Anything Model (SAM) to generate high-quality segmentation masks and refine them for a specific target object across images. Extracted the polygon coordinates of each segmented object and structured the annotations into YAML files for downstream training pipelines. Ensured annotation consistency, validated mask accuracy, and prepared the dataset for integration with modern segmentation and object detection frameworks.
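The polygon-to-YAML step described above could look roughly like the sketch below. It is illustrative only: the field names (`image`, `label`, `polygon`), the file name, and the normalized-coordinate layout are assumptions, not the schema used in the project, and the polygon is hard-coded rather than taken from a SAM mask. The area check is one simple way to flag degenerate masks during validation.

```python
def polygon_to_yaml(image_name, label, polygon, width, height):
    """Serialize one polygon as a small YAML document (hypothetical schema)."""
    lines = [
        f"image: {image_name}",
        f"label: {label}",
        "polygon:",
    ]
    for x, y in polygon:
        # Normalize pixel coordinates to the [0, 1] range.
        lines.append(f"  - [{x / width:.4f}, {y / height:.4f}]")
    return "\n".join(lines) + "\n"


def polygon_area(polygon):
    """Shoelace formula; a near-zero area suggests a degenerate mask."""
    n = len(polygon)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical polygon in pixel coordinates for a 200x100 image.
example_polygon = [(10, 10), (110, 10), (110, 60), (10, 60)]
yaml_text = polygon_to_yaml("img_0001.jpg", "target", example_polygon, 200, 100)
print(yaml_text)
print("area in pixels:", polygon_area(example_polygon))
```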

2025 - 2025

Computer Vision Challenge

Image / Bounding Box

Annotated image datasets for an object detection task using bounding box labeling. Used Roboflow to efficiently label objects across a large set of images while following strict annotation guidelines. Ensured accurate bounding boxes, handled edge cases such as occlusions and overlapping objects, and performed quality checks to maintain dataset consistency. The labeled dataset was prepared for training computer vision models using common formats such as YOLO.
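As a rough illustration of the YOLO format mentioned above, the sketch below converts a pixel-space box into a YOLO label line (class id followed by normalized center x, center y, width, height). The function names and the clipping behavior are assumptions for illustration; Roboflow performs this export itself.

```python
def to_yolo(box, img_w, img_h):
    """Convert (x_min, y_min, x_max, y_max) in pixels to normalized YOLO values."""
    x_min, y_min, x_max, y_max = box
    # Clip to the image so partially out-of-frame boxes stay valid.
    x_min, x_max = max(0, x_min), min(img_w, x_max)
    y_min, y_max = max(0, y_min), min(img_h, y_max)
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("box has no area inside the image")
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return cx, cy, w, h


def yolo_line(class_id, box, img_w, img_h):
    """Format one label line as it would appear in a YOLO .txt file."""
    cx, cy, w, h = to_yolo(box, img_w, img_h)
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


print(yolo_line(0, (100, 50, 300, 250), 400, 500))
```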

2025 - 2025

AI Data Labeling for Text Summarization Project

Text / Text Summarization

I developed an NLP model for text summarization in Arabic and English by fine-tuning the T5 transformer, which involved constructing labeled datasets and validating summary outputs. I performed text annotation, summary alignment, and model output evaluation to enhance summarization accuracy. The work contributed directly to supervised learning pipelines in NLP.
• Engaged in manual summarization and gold-standard dataset creation
• Aligned input texts with reference summaries for supervised fine-tuning
• Evaluated summarization precision and consistency
• Assisted in metadata labeling for quality assessment
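The profile does not say which metric was used to evaluate summaries. As one common option, a unigram-overlap score in the style of ROUGE-1 can be sketched as follows. The whitespace tokenization is an assumption and is too crude for real Arabic text, where a proper tokenizer would be needed.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f = rouge1("the cat sat", "the cat sat on the mat")
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```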

2024 - 2024

Multilingual Audio Transcription Dataset for Speech Recognition

Audio / Transcription

Collected and prepared a multilingual audio dataset from YouTube, Facebook, and X to support my graduation project in speech recognition. Audio samples were processed and automatically transcribed using Whisper. The dataset included recordings in English, German, Spanish, and Arabic (both Modern Standard Arabic and Egyptian dialect). Generated and validated transcription outputs to ensure alignment between speech and text, creating structured labeled data suitable for training and evaluating speech-to-text and NLP models.
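One standard way to validate automatic transcripts against a manually checked reference is word error rate (WER). The source does not state that WER was the measure used here, so the sketch below is only an example of how such a check could be done, with whitespace tokenization as a simplifying assumption.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the quick brown fox", "the quick fox"))
```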

2023 - 2024

AI Data Labeling for Quran Question Answering AI Model

Text / Question Answering

I trained and fine-tuned transformer-based models (RoBERTa, AraBERT) for question answering specific to Quranic content, which involved annotating question-answer pairs and validating model outputs. My tasks included curating datasets, labeling relevant answers, and implementing evaluation routines to ensure dataset quality. This project contributed to improving domain-specific NLP models by providing accurately labeled data.
• Annotated question-answer pairs for AI fine-tuning in a religious context
• Performed validation and review of model-predicted answers
• Ensured dataset integrity with systematic accuracy checks
• Provided domain expertise to enhance labeling quality
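A systematic integrity check for extractive question-answer data might resemble the sketch below, which assumes SQuAD-style records (`context`, `question`, and `answers` with `text` and `answer_start`). That record layout, and the English placeholder text, are assumptions for illustration; the project's actual dataset format is not described in the profile.

```python
def validate_record(record):
    """Return a list of problems found in one SQuAD-style record."""
    problems = []
    context = record.get("context", "")
    if not record.get("question", "").strip():
        problems.append("empty question")
    answers = record.get("answers", [])
    if not answers:
        problems.append("no answers")
    for answer in answers:
        text = answer["text"]
        start = answer["answer_start"]
        # The labeled span must match the context at the stated offset.
        if start < 0 or context[start:start + len(text)] != text:
            problems.append(f"answer span mismatch at offset {start}")
    return problems


# Hypothetical records: the first is consistent, the second has a bad offset.
good = {
    "context": "The chapter has seven verses.",
    "question": "How many verses does the chapter have?",
    "answers": [{"text": "seven verses", "answer_start": 16}],
}
bad = dict(good, answers=[{"text": "seven verses", "answer_start": 10}])

print(validate_record(good))
print(validate_record(bad))
```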

2023 - 2024

Education

London South Bank University / British University in Egypt

Bachelor of Science, Informatics and Computer Science, Artificial Intelligence Specialization

2020 - 2024

Work History

Freelancing

Freelance Web / AI Developer (SaaS)

Cairo

2024 - 2025

National Bank of Egypt

Network Engineer Intern

Cairo

2022 - 2022