Vincent Kalu

Senior ML Engineer - Document Data Annotation and Segmentation

Remote, United Kingdom
$50.00/hr | Expert | SuperAnnotate, Roboflow, AWS SageMaker

Key Skills

Software

SuperAnnotate
Roboflow
AWS SageMaker
Labelbox

Top Subject Matter

Biometric Identity Documents
Legal Documents
Crime Reports

Top Data Types

Document
Text
Image

Top Task Types

Segmentation
Entity (NER) Classification
Object Detection

Freelancer Overview

Senior ML engineer specializing in document data annotation and segmentation, with 13+ years of professional experience spanning complex workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Holds a Bachelor of Science from the National Open University of Nigeria (2025). AI-training work covers document, text, and image data, with labeling workflows including segmentation, entity (NER) classification, and object detection.

English: Expert

Labeling Experience

Lead AI Engineer & Technical Co-Founder – NLP Data Labeling and Entity Extraction

Text | Entity (NER) Classification
I designed and managed NLP pipelines for labeling unstructured crime report text for entity recognition and metadata tagging. My responsibilities included custom text chunking strategies and entity classification to support semantic document retrieval in live production systems. The labeled datasets fed crime classification via large language models and vector search systems.
• Built workflows for named entity recognition and sensitive data (PII) detection in police and crime reports.
• Generated metadata tags and semantic segments for crime-related document bodies.
• Implemented custom chunking for sentence-, paragraph-, and sliding-window-level annotation.
• Provided labeled text sets enabling classification, retrieval, and LLM prompt development.

2020 - Present
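
The sentence-level and sliding-window chunking mentioned in the entry above could look roughly like the sketch below. It is an illustration only, not the pipeline used in production: the function names (`split_sentences`, `sliding_windows`, `chunk_report`), the regex-based sentence splitter, and the default window size and stride are all assumptions.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption): break after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sliding_windows(items, size, stride):
    """Return overlapping windows of up to `size` items, advancing by `stride`."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows = []
    for start in range(0, len(items), stride):
        windows.append(items[start:start + size])
        # Stop once a window has reached the end of the sequence.
        if start + size >= len(items):
            break
    return windows


def chunk_report(text, size=3, stride=2):
    """Chunk a report into overlapping groups of sentences for annotation."""
    sentences = split_sentences(text)
    return [" ".join(w) for w in sliding_windows(sentences, size, stride)]
```

A stride smaller than the window size makes neighbouring chunks overlap, so an entity mention that straddles a chunk boundary still appears whole in at least one chunk.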

ML Engineer – Computer Vision Data Annotation for Mobile AI Platform

Image | Object Detection
I developed and deployed mobile computer vision pipelines that annotated images for real-time object and facial recognition in financial crime investigations. Tasks included image preprocessing and labeling for document verification, facial matching, and evidence detection using TensorFlow Lite. The labeled datasets were used for training, validating, and monitoring on-device inference systems for law enforcement.
• Labeled face, object, and document boundaries for AI-driven investigation apps.
• Processed and tagged images with OCR, metadata, and tamper-proof audit trails.
• Created evidence tagging workflows to support fraud and financial crime analysis.
• Enabled 30+ FPS detection on mobile platforms by refining and validating image data labels.

2024 - 2025

Senior ML Engineer - Document Data Annotation and Segmentation

Document | Segmentation
I implemented document parsing and data labeling pipelines for unstructured biometric registration documents, including OCR, semantic segmentation, and metadata extraction. My work involved hierarchical document processing (forms, tables, nested layouts) and integrating computer vision with language models for automated data validation. The resulting labeled datasets supported a multimodal AI platform serving 15 West African countries.
• Designed multilayered OCR workflows using LayoutLM and ResNet for document and image annotation.
• Labeled text and tabular data within scanned legal and biometric records for identity validation.
• Automated entity extraction and semantic region segmentation across complex document layouts.
• Delivered labeled datasets to enable model fine-tuning, structured retrieval, and table recognition.

2024 - 2025

Education

National Open University of Nigeria

Bachelor of Science, Computer Science

2014 - 2025

Work History

Mudian Limited

Senior Data Engineer

Remote
2025 - Present

Algoteam Software Labs

Lead AI Engineer & Technical Co-Founder

Abuja
2020 - Present