Samson Shittu - Machine Learning Engineer/Data Annotator

Key Skills

Software

Google Cloud Vertex AI

Label Studio

Top Subject Matter

Multi-modal AI

Retrieval-Augmented Generation

Document Processing

Top Data Types

Text

Image

Document

Top Task Types

Classification

Freelancer Overview

My experience with AI training data centers on the engineering and pipeline development needed to extract, structure, and optimize datasets for machine learning models. As a Machine Learning Engineer, I have built production-grade systems for structured data extraction from complex documents using OCR, layout-aware LLMs, and custom validation logic, which are essential steps in turning raw information into accurate training assets.

A highlight of my data optimization work was leading the development of a data deduplication system at DataPath, where I fine-tuned custom embedding models on domain-specific data to improve entity matching accuracy by 35x, raising the quality and reliability of the underlying datasets.

I also have hands-on experience curating and formatting training data to fine-tune open-source LLMs (such as Mistral and the Llama series) for client tasks like sentiment analysis and code generation. Together with my work optimizing Retrieval-Augmented Generation (RAG) pipelines through data context refinement, query transformation, and re-ranking, this gives me a working understanding of the full lifecycle of AI data preparation.

I combine data processing tools (Pandas, Polars) with a strong foundation in NLP and machine learning frameworks like PyTorch and Hugging Face to turn raw, unstructured data into high-quality training assets.

Education: Bachelor of Science, University of Ilorin (2023).

Intermediate English

Labeling Experience

Freelancer | Machine Learning Engineer (contract)

Other | Text | Fine Tuning

I designed and implemented fine-tuning strategies for open-source large language models to perform specialized tasks. These included sentiment analysis and code generation, adapting models for client-specific use cases. Deliverables were comprehensive codebases, detailed documentation, and hands-on support for client teams.

• Implemented LoRA and other parameter-efficient methods for LLM adaptation.
• Gathered and curated textual datasets for supervised fine-tuning.
• Provided knowledge transfer sessions to ensure ongoing client success.
• Consulted on model selection between open-source and proprietary LLMs.

2021 - Present

Machine Learning Engineer (contract) | DataPath

Google Cloud Vertex AI | Text | Fine Tuning

I fine-tuned custom embedding models and LLMs using domain-specific data to improve downstream tasks. My responsibilities included preparing textual datasets, training and validating models, and deploying fine-tuned models for practical applications. These efforts led to measurable improvements in model entity matching, contextual understanding, and recommendation accuracy.

• Architected multi-modal agentic pipelines utilizing fine-tuned models for real-time applications.
• Built structured data extraction systems leveraging layout-aware LLMs and OCR technologies.
• Led development efforts to improve model robustness through continuous evaluation and red-teaming.
• Optimized and managed production inference endpoints for scalable model deployment.

2023 - 2025

Education

University of Ilorin

Bachelor of Science, Computer Science

2018 - 2023

Good Shepherd Comprehensive High School

Secondary School Certificate, General Secondary Education

2015 - 2018

Work History

N/A

Freelancer | Machine Learning Engineer (contract)

N/A

2021 - Present

Acada

Product Manager

N/A

2025 - 2026