Benjamin Ikuesan

AI Trainer & LLM Evaluator — Freelance (Outlier / Appen)

Gdansk, Poland
$10.00/hr | Expert

Key Skills

Software

Appen
Google Cloud Vertex AI
Don't disclose

Top Subject Matter

Software Engineering
Finance
AI

Top Data Types

Text
Image
Computer Code Programming

Top Task Types

RLHF
Object Detection
Action Recognition
Classification
Text Generation
Evaluation/Rating
Transcription
Computer Programming/Coding
Fine-tuning
Text Summarization
Prompt + Response Writing (SFT)

Freelancer Overview

I am a technically proficient AI Trainer and data annotator with a Master's degree in Mathematical Engineering and over three years of hands-on software development experience. Through my work on Outlier and Appen, I have evaluated and rated AI-generated outputs across code, text, images, and video, providing structured preference feedback that feeds directly into RLHF pipelines. My engineering background in Python, C++, JavaScript, and Bash means I don't just label code outputs; I understand them, catching subtle logical errors, inefficiencies, and edge-case failures that non-technical annotators typically miss. I have also written high-quality prompt-response pairs for mathematics, reasoning, and software engineering domains, consistently meeting platform quality benchmarks.

What sets me apart is the combination of mathematical rigour and practical engineering depth I bring to every task. My BSc in Mathematics and thesis work in numerical algorithms give me a strong foundation for evaluating logical reasoning, scientific accuracy, and proof-based content, domains where most annotators struggle. I have hands-on experience with multi-modal annotation, including image classification, video frame labelling, and audio transcription, alongside LLM evaluation tasks such as side-by-side model comparison, hallucination detection, and instruction-following assessment. Whether the task involves debugging AI-generated code, ranking competing model responses, or annotating complex visual data, I bring the technical credibility and attention to detail that produce high-quality training data.

Expert | English | Italian | Polish

Labeling Experience

Appen

AI Trainer & LLM Evaluator — Freelance (Outlier / Appen)

Appen | Video | Action Recognition
Labeled video data frame-by-frame for action recognition and object tracking, supporting the training of computer vision models. Performed temporal annotation, marking start and end points for specific actions or events within video segments. Ensured annotation quality through multi-modal dataset review and consistency checks across video samples.
• Conducted frame-wise object tracking and action segmentation.
• Annotated temporal boundaries and classified actions in video clips.
• Evaluated dataset quality to identify and rectify inconsistencies in annotations.
• Supported annotation tasks for varied video content in vision model development.

2024 - Present
Appen

AI Trainer & LLM Evaluator — Freelance (Outlier / Appen)

Appen | Image | Object Detection
Performed image labeling activities, such as bounding box annotation, segmentation, and classification for computer vision model development. Labeled and reviewed datasets for quality in tasks involving multi-class image recognition and keypoint annotation. Ensured consistent and high-quality labeling in line with model training standards and annotation benchmarks.
• Completed image object detection and classification using various annotation tools.
• Checked label consistency across diverse image classes and labeling styles.
• Applied knowledge of mathematical structures to improve label accuracy and dataset integrity.
• Contributed to classification and detection tasks for computer vision pipelines.

2024 - Present
Appen

AI Trainer & LLM Evaluator — Freelance (Outlier / Appen)

Appen | Text | RLHF
Annotated, rated, and ranked LLM-generated text outputs for factual accuracy, coherence, instruction-following, and tone. Contributed structured preference data for RLHF pipelines and large language model fine-tuning. Identified errors, hallucinations, and logical inconsistencies in AI responses during side-by-side and standalone evaluations.
• Performed text annotation tasks, including preference ranking and output rating.
• Provided detailed written rationales and structured feedback for model improvement.
• Evaluated long-form outputs and assessed quality according to domain-specific prompts.
• Applied subject-matter expertise in logical and mathematical reasoning to text output review.

2024 - Present
Appen

AI Trainer & LLM Evaluator — Freelance (Outlier / Appen)

Appen | Computer Code Programming | RLHF
Evaluated AI-generated code outputs in Python, JavaScript, and C++ for correctness, efficiency, and style. Provided structured feedback and annotated code to enhance the quality of training data for large language models. Rated, ranked, and reviewed code-focused model outputs as part of RLHF and LLM training pipelines.
• Reviewed and debugged AI-generated code examples for accuracy and instructional value.
• Provided test coverage feedback and logical reasoning assessments in software engineering tasks.
• Assessed coding prompt responses for OOP/SOLID principle adherence and creativity.
• Created and evaluated prompt-response pairs in programming, mathematics, and reasoning domains.

2024 - Present

Education

Gdansk University of Technology

PhD, Quantum Computation (Machine Learning)
2025 - 2026

Gdansk University of Technology

Master of Science, Nanotechnology
2022 - 2024

Work History

Mthree

SRE / QA & Automation Engineer Intern

Warsaw
2024