Abdelfattah Ali - Data Modeler - Machine Learning Pipelines

Key Skills

Software

Labelbox

Label Studio

Scale AI

Snorkel AI

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming

Image

Text

Video

Top Label Types

Bounding Box

Data Collection

Object Detection

Prompt Response Writing SFT

Text Generation

Tracking

Freelancer Overview

I am a data modeling and AI training data specialist with extensive experience designing and implementing end-to-end data labeling frameworks, annotation schemas, and quality validation processes across the healthcare, insurance, telecom, and logistics domains. My expertise spans developing ML-ready data pipelines, managing large-scale training datasets, and ensuring high-quality, bias-mitigated labeled data using tools such as Label Studio, Amazon SageMaker Ground Truth, and active learning frameworks. I have led cross-functional teams to architect robust data governance strategies, integrate metadata management with labeling standards, and optimize workflows for supervised machine learning projects. My strong background in SQL, Snowflake, Power BI, and data modeling tools such as ERWIN and PowerDesigner enables me to deliver reliable, compliant, and actionable datasets that accelerate AI and analytics initiatives.

English (Expert)

Labeling Experience

Enterprise ML Data Labeling & Governance Framework – Transportation & Logistics

Labelbox, Image, Bounding Box, Text Generation

I led the design and implementation of enterprise-scale data labeling and governance frameworks to support supervised machine learning and LLM training in transportation and logistics. I built annotation schemas, entity taxonomies, and classification standards, and worked closely with data science teams to create clean, reliable ML-ready datasets supported by clear guidelines and strong inter-annotator agreement (IAA) metrics. I developed automated validation pipelines to monitor label accuracy, detect bias, and verify dataset integrity, and integrated metadata management to ensure full traceability and compliance. Using Label Studio and Amazon SageMaker Ground Truth, I managed scalable annotation workflows, supported active learning, enabled human-in-the-loop refinement, and contributed to RLHF-style evaluation and fine-tuning efforts. As a result, we significantly improved training data quality, reduced labeling inconsistencies, and strengthened over

2024 - 2025

Education

University of California

Doctor of Science, Computer Science

2011 - 2011

Work History

Amtrak

Data Modeler/ML Data Specialist

Washington

2024 - Present

Estes Express Lines

Solution Architect/Data Modeler

Richmond

2020 - 2024