Daniel Samoei - AI Data Trainer and Evaluator | Text, Image, and Audio Annotation Expert

Key Skills

Software

CloudFactory

CVAT

Google Cloud Vertex AI

Scale AI

Remotasks

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming

Document

Text

Top Task Types

Audio Recording

Computer Programming Coding

Evaluation Rating

Prompt Response Writing SFT

Translation Localization

Freelancer Overview

I am an experienced Python programmer and data analyst specializing in automation, data integration, and analytics throughout the time of more than 5 years. My activities centers on the development of scalable repeatable systems to transform raw data into AI-ready data. I possess Python, SQL, and RESTful API knowledge and have automated workflows, pipelines, and containers with Docker, and incorporated real-time data into BI systems. I focus currently on the text evaluation, prompt engineering, and computer vision annotation as my field of specialization in AI training and data labeling. I also possess certain practical experience of operating such tools as Label Studio, CVAT, Roboflow, andLabelbox that may be useful when dealing with a substantial amount of data that requires labelling and validation programmes. My understanding of automation and API-based workflow is good, and it makes me precise, reliable, and capable of providing quality of various AI training programs.

ExpertEnglish

Labeling Experience

AI Text and Code Evaluation for LLM Fine-Tuning

Scale AIComputer Code ProgrammingTranslation LocalizationEvaluation Rating

Contributed to the testing and troubleshooting of a large-scale multilingual text generation, summarization, and function-calling model. Responsibilities included writing and revising prompt–response pairs, evaluating model outputs based on accuracy and coherence, and identifying entities for downstream NLP processes. Additionally, developed Python scripts to automatically validate and ensure consistency across annotation batches. The project involved more than 100,000 text samples and adhered to strict quality standards, including inter-annotator agreement and multi-phase review protocols. Supported the integration of labeled datasets into an automated CI/CD pipeline, enabling continuous model retraini

2021 - 2024

Education

C

Coursera

Professional Certificate, Data Analysis

Professional Certificate

2020 - 2020

M

Moi University

Bachelor of Science, Physics

Bachelor of Science

2014 - 2019

Work History

Q

Quantumnest Global

Senior Business Analyst

Remote

2023 - 2025

A

Amgen

Business Systems Analyst

Remote

2022 - 2024