Waleed Ashraf - Egyptian National ID Card Reading System - Data Labeling & OCR Automation

Key Skills

Software

AWS SageMaker

Top Subject Matter

Identity Document OCR and Information Extraction

ANPR Domain Expertise

Traffic Monitoring

Top Data Types

Document

Image

Text

Top Task Types

Classification

Object Detection

Question Answering

Freelancer Overview

Built an Egyptian National ID card reading system spanning data labeling and OCR automation. Brings 4+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include OpenCV, EasyOCR, and Tesseract. Education includes a Certificate in Data Science from EpsilonAI Institute (2022) and a Bachelor of Science from October 6 University (2019). AI-training focus covers Document, Image, and Text data, with labeling workflows in Classification, Object Detection, and Question Answering.

English (Expert)

Labeling Experience

RAG Q&A Systems – Labeled Data for LLMs and Context Grounding

Text, Question Answering

Designed and deployed Retrieval-Augmented Generation (RAG) systems using LLMs and prompt engineering to improve information retrieval accuracy for marketing and legal document workflows. Labeled and curated question-answer pairs, context passages, and ground truth answers to optimize RAG model performance and reduce hallucinations. Developed pipelines to support context retrieval evaluation and LLM output scoring.

• Labeled large document datasets with gold-standard Q&A pairs for model tuning.
• Created context-chunk and reference mapping for answer verification.
• Applied evaluation metrics for RAG model assessment using labeled data.
• Automated data labeling workflows via custom scripts and internal tooling.
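The profile does not say which evaluation metrics were used. As a minimal sketch, assuming SQuAD-style token F1 for answer scoring and recall@k for context retrieval (both chosen here for illustration, with hypothetical function names and chunk IDs), labeled Q&A pairs and context mappings could be scored like this:

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a model answer and a gold answer."""
    pred, ref = normalize(prediction), normalize(gold)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def recall_at_k(retrieved_ids: list[str], gold_ids: set[str], k: int) -> float:
    """Fraction of gold context chunks found in the top-k retrieved chunks."""
    if not gold_ids:
        return 0.0
    hits = gold_ids.intersection(retrieved_ids[:k])
    return len(hits) / len(gold_ids)
```

For example, `recall_at_k(["c1", "c2", "c3"], {"c2", "c9"}, k=2)` checks whether the labeled reference chunks appear among the first two retrieved chunks.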

2025 - 2026

Arabic Text Analysis System – NLP Data Annotation and Classification

Text, Classification

Built an Arabic text analysis system involving data labeling for tasks such as Sentiment Analysis, Named Entity Recognition (NER), Text Classification, and Summarization. Fine-tuned transformer models on labeled datasets to improve performance across Arabic dialects. Applied consistent annotation guidelines to produce reliable and diverse benchmark data.

• Created and curated labeled Arabic text datasets for NLP model training.
• Annotated texts with sentiment, named entities, categories, and summaries following task-specific protocols.
• Supported fine-tuning of AraBERT and other transformer architectures.
• Enabled benchmarking and cross-dialect accuracy analysis via structured labeling workflows.
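Cross-dialect accuracy analysis of this kind can be illustrated with a short sketch. It assumes each labeled record carries a dialect tag, a gold label, and a model prediction; the record layout and the dialect names in the usage note are hypothetical, not taken from the project.

```python
from collections import defaultdict


def accuracy_by_dialect(records):
    """Compute accuracy per dialect.

    records: iterable of (dialect, gold_label, predicted_label) tuples.
    Returns a dict mapping each dialect to its share of correct predictions.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for dialect, gold, predicted in records:
        total[dialect] += 1
        correct[dialect] += int(gold == predicted)
    return {dialect: correct[dialect] / total[dialect] for dialect in total}
```

Calling it on records tagged, say, `"egyptian"` and `"gulf"` returns one accuracy figure per tag, which makes gaps between dialects easy to spot.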

2023 - 2025

Disease Detection from Medical Images - Annotation & Training Data Preparation

Image, Classification

Built and labeled deep learning datasets for disease detection from medical images, including X-rays and ultrasound scans. Developed CNNs using augmented and annotated datasets to improve classification performance and support diagnostic prediction. Applied data augmentation and labeling protocols suitable for medical imaging applications.

• Selected and annotated relevant regions within medical images for classification.
• Applied standardized protocols to ensure labeling quality and reproducibility.
• Managed class imbalance and dataset partitioning during model training.
• Labeled and processed images to support clinical decision support systems.
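The profile does not describe how class imbalance and partitioning were handled. One common approach, sketched here as an assumption rather than the project's actual method, is a stratified train/validation split combined with inverse-frequency class weights. The function names and the 20% validation fraction are illustrative.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split sample indices so each class keeps roughly the same share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, val = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one validation sample per class, even for rare classes.
        n_val = max(1, round(len(indices) * val_fraction))
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

The resulting weights can be passed to a weighted loss so that rare classes contribute more per sample during training.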

2023 - 2025

Vehicle License Plate Recognition – Data Labeling & Model Training

Image, Object Detection

Developed a real-time Automatic Number Plate Recognition (ANPR) system for detecting and reading vehicle license plates from images and video streams. Used YOLOv8 for plate detection (bounding boxes) and EasyOCR for text extraction, enabling training, validation, and evaluation of plate localization and reading models. Optimized the data labeling workflow for multi-language (Arabic/Latin) plate recognition and real-time deployment scenarios.

• Labeled license plate regions within images and video frames for model training.
• Prepared ground truth data for plate recognition and text extraction tasks.
• Conducted model evaluation and error analysis based on labeled data.
• Processed diverse image and video datasets for traffic monitoring and parking management.
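Evaluating plate localization against labeled boxes typically relies on intersection over union (IoU). The sketch below assumes boxes in `(x_min, y_min, x_max, y_max)` pixel format; the profile does not state the box format or the matching threshold used in the project.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

A predicted plate box is usually counted as correct when its IoU with the ground truth box exceeds a chosen threshold, such as 0.5.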

2023 - 2025

Egyptian National ID Card Reading System - Data Labeling & OCR Automation

Document, Classification

Developed an OCR-based system to extract structured data from Egyptian National ID cards, including name, ID number, address, date of birth, and gender. Designed and implemented multi-stage data pipelines combining object detection, field segmentation, and Arabic OCR for accurate field extraction. Integrated the labeled output into an automated identity verification workflow processing thousands of documents monthly.

• Created procedures to detect and crop card regions from images.
• Developed perspective correction and field segmentation stages for high-accuracy extraction.
• Built and tuned Arabic OCR models for structured information labeling.
• Processed over 5,000 documents and reduced manual data entry time by 90%.
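Once the ID number has been read by OCR, date of birth and gender can be cross-checked against it. The sketch below assumes the commonly described 14-digit layout (century digit, YYMMDD, two-digit governorate code, four-digit sequence whose last digit is odd for male and even for female, then a check digit). That layout is an assumption to verify against an official specification, and the function is illustrative, not the project's code.

```python
from datetime import date

# Map Arabic-Indic digits (U+0660..U+0669), as they appear on the card, to ASCII.
ARABIC_INDIC = "".join(chr(0x0660 + i) for i in range(10))
TO_ASCII = str.maketrans(ARABIC_INDIC, "0123456789")


def parse_national_id(id_number: str) -> dict:
    """Derive birth date, governorate code, and gender from a 14-digit ID string."""
    digits = id_number.translate(TO_ASCII)
    if len(digits) != 14 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 14 digits")
    # Assumed convention: 2 means born in the 1900s, 3 means born in the 2000s.
    century = {"2": 1900, "3": 2000}.get(digits[0])
    if century is None:
        raise ValueError("unsupported century digit")
    # date() raises ValueError for impossible dates, which flags OCR misreads.
    birth = date(century + int(digits[1:3]), int(digits[3:5]), int(digits[5:7]))
    gender = "male" if int(digits[12]) % 2 else "female"
    return {
        "date_of_birth": birth,
        "governorate_code": digits[7:9],
        "gender": gender,
    }
```

A mismatch between these derived values and the separately extracted date-of-birth or gender fields is a useful signal for routing a card to manual review.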

2023 - 2025

Education

EpsilonAI Institute

Certificate in Data Science

2021 - 2022

October 6 University

Bachelor of Science, Computer Science and Information Systems

2015 - 2019

Work History

Marketing Company

AI Engineer

Manama

2025 - 2026

Clickits

Data Scientist

Cairo

2023 - 2025