Marvin Bulahan

AI agent training (evals, JSON diffs, tasks); labelling/rating (text, images, etc.)

Biñan, Philippines
$20.00/hr · Intermediate · Other · Internal Proprietary Tooling

Key Skills

Software

Other
Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming
Image
Text

Top Task Types

Computer Programming Coding
Diagnosis
Evaluation Rating
Function Calling
RLHF

Freelancer Overview

I am a data annotation and AI evaluation specialist with hands-on experience working on complex agentic AI systems. In my current role at a stealth AI startup, I review multi-step agent behaviors, validate tool-use decisions, analyze JSON state transitions, and produce gold-standard reasoning paths for autonomous agent training. I’m skilled at identifying ambiguities, surfacing missing assumptions, refining edge cases, and documenting ideal vs. acceptable vs. incorrect reasoning outcomes. In parallel, I work with Welocalize (WeLo Data) on search quality, ads relevance, and multimodal annotation projects, developing strong judgment in policy-based evaluation, linguistic nuance, and structured data interpretation. I consistently deliver high-accuracy work, strong analytical insights, and clear evaluation logic. My focus is producing reliable, high-fidelity data that strengthens AI reasoning, robustness, and real-world performance.
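To make the JSON state-transition analysis mentioned above concrete, here is a minimal sketch of the kind of check an annotator might run when validating an agent step. The function name `diff_states` and the sample states are hypothetical, purely for illustration:

```python
def diff_states(before: dict, after: dict) -> dict:
    """Return a flat diff of two JSON-like agent state dicts:
    keys added, removed, or changed between consecutive steps."""
    diff = {"added": {}, "removed": {}, "changed": {}}
    for key in after.keys() - before.keys():      # keys the step introduced
        diff["added"][key] = after[key]
    for key in before.keys() - after.keys():      # keys the step dropped
        diff["removed"][key] = before[key]
    for key in before.keys() & after.keys():      # keys whose values changed
        if before[key] != after[key]:
            diff["changed"][key] = {"from": before[key], "to": after[key]}
    return diff

# Hypothetical agent step that should only update "cart":
before = {"cart": [], "user": "u1"}
after = {"cart": ["itemA"], "user": "u1"}
print(diff_states(before, after))
# → {'added': {}, 'removed': {}, 'changed': {'cart': {'from': [], 'to': ['itemA']}}}
```

A diff like this lets a reviewer confirm that a tool call touched only the fields it was supposed to, which is the core of tool-use correctness checks.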

Intermediate · Tagalog · English · Japanese

Labeling Experience

AI Data Operations - Data Annotator

Internal Proprietary Tooling · Document · RLHF
Working on a large-scale autonomous agent training project focused on evaluating multi-step reasoning, tool-use correctness, and JSON-based state transitions. Tasks include annotating agent decision paths, identifying ambiguities, classifying errors, refining task instructions, and defining gold-standard behaviors across diverse scenarios. The project involves thousands of agent runs and multi-layer evaluations, requiring strict adherence to accuracy benchmarks, consistency checks, and multi-review quality controls. Deliverables undergo periodic audits, cross-review validation, and performance scoring to maintain high-quality training data for agentic AI systems.

2025

Data Annotator, Data Analyst

Other · Text · Evaluation Rating
Providing high-quality data annotation and evaluation for search quality, ads relevance, and multimodal AI training projects. Tasks include reviewing search queries, rating content relevance, annotating text/image/video data, and applying strict policy guidelines across large-volume datasets. The project spans thousands of items per cycle and requires maintaining high accuracy, consistency, and adherence to detailed instructions. Quality is measured through periodic audits, calibration tests, peer reviews, and continuous performance scoring to ensure reliable training data for large-scale AI systems.

2024

Education


Datacamp (Data Engineering Pilipinas)

Data Analytics, Data Engineering, AI Engineering, Data Science
2024 - 2025

Coursera (Google Courses)

Data/BI Analytics, UI/UX Design, Project Management
2023 - 2025

Work History


AI Startup (Stealth)

AI Data Operations - Data Annotator

New York
2025 - Present

Welocalize (Welo Data)

Data Annotator

Biñan
2024 - Present