Ayman Amasrour - PhD Research Intern – LIMSET, UM6P

Key Skills

Software

Other

CVAT

Roboflow

Scale AI

Sama

Mindrift

Top Subject Matter

Structural Defects in Construction Materials

Multilingual Sentiment Analysis

Top Data Types

Image

Text

Audio

Top Task Types

Segmentation

Fine-tuning

Bounding Box

Classification

Object Detection

Text Summarization

Text Generation

Data Collection

RLHF

Question Answering

Freelancer Overview

I have hands-on experience in data annotation and AI training data production across both research and applied projects. As Co-Founder & CEO of Torath Quest, an AI-powered cultural heritage platform, I personally annotated image datasets of the Skala of Essaouira, a historic Moroccan monument, using Roboflow and CVAT. I drew bounding boxes around the cannons in real-world photographs of the site so that an AI model could automatically detect and count them.

I also carried out PhD-level dataset construction at LIMSET/UM6P under Dr. Hassan Naanani, building MRCB (Moroccan Rural Construction Buildings), the first annotated dataset of structural defects on traditional North African constructions. It was used to fine-tune YOLOv8-seg for crack detection under severe domain shift.

Beyond annotation, I work across the full AI training data pipeline. At Atlas AI (March 2026–Present), I author technical NLP content and design precise problem statements used to train and evaluate AI systems. I applied for and was reviewed by Mindrift for their Freelance ML Engineer and Data Science Engineer roles, demonstrating expertise in designing non-trivial, verified STEM problems for AI training.

My other projects include fine-tuning XLM-RoBERTa across 5 languages plus Moroccan Darija (94.01% accuracy, published on Hugging Face), building YuuInsight (an NLP platform with 87% accuracy and a full MLOps pipeline), and developing PharmaAI, an AI system with OCR, NLP, and predictive ML modules presented at RabHacks 2026.

My combination of real dataset construction, image annotation experience, model evaluation, and clear technical writing makes me a strong fit for AI training data work.

Intermediate

English

French

Labeling Experience

Multilingual Sentiment Analysis – XLM-RoBERTa + Darija Project

Other

Text

Fine-tuning

Fine-tuned XLM-RoBERTa for multilingual sentiment analysis across English, French, Spanish, Arabic, and Moroccan Darija using over 40,000 annotated samples. The data work included gathering, cleaning, annotating, and balancing labeled text for supervised model training. Optimized the dataset for highly accurate sentiment classification on under-resourced languages.

• Annotated and reviewed text samples for consistent positive, neutral, and negative sentiment labeling.
• Balanced class representation to reduce model bias across multiple languages and dialects.
• Managed preprocessing, data cleaning, and validation of the final curated dataset.
• Published the sentiment model and documented the labeling schema for reproducibility.

2026 - Present

PhD Research Intern – LIMSET, UM6P

Other

Image

Segmentation

Built MRCB, the first annotated dataset of structural defects on traditional North African constructions, focusing on image segmentation and object detection of cracks and defects. Led fine-tuning of YOLOv8-seg for crack detection under domain shift and benchmarked it against zero-shot SAM. Extended the defect annotation pipeline to composite pressure vessel inspection for broader safety applications in the construction industry.

• Labeled thousands of images of rural Moroccan structures and heritage buildings.
• Applied segmentation and object detection to differentiate between crack types and material heterogeneity.
• Managed the data labeling process and quality control for multi-domain defect annotation.
• Used YOLOv8-seg and the Segment Anything Model (SAM) as core annotation and benchmarking tools.

2026 - Present

Education

Ibn Abbad High School

International Baccalaureate, Mathematical Sciences

2022 - 2023

EMINES – School of Industrial Management, UM6P

Bachelor of Science, Industrial Management and Engineering

2022

Work History

First Lego League

Coach

2026 - Present

Digrow

Data Science Intern

2026 - Present