
Juan David Marcillo Alba

Senior Web UI Developer

Santiago de Cali, Colombia
$30.00/hr
Intermediate
AWS SageMaker

Key Skills

Software

AWS SageMaker

Top Subject Matter

Retail - E-commerce
Entertainment
Banking

Top Data Types

Computer Code Programming

Top Task Types

Evaluation Rating
Computer Programming Coding
Transcription
Question Answering

Freelancer Overview

Senior Web UI Developer with 10+ years of professional experience spanning complex workflows, research, and quality-focused execution. Holds a Bachelor of Science in Systems Engineering from Universidad Autónoma de Occidente (2021).

English: Intermediate

Labeling Experience

Coding Challenge Complexity Evaluator

Data type: Computer Code Programming. Task type: Computer Programming Coding.
Created and refined coding challenges for AI model benchmarking by iterating on prompt specifications, test suites, and reference solutions. Each task required ensuring the prompt was unambiguous and well-specified, writing 15+ test cases with full requirement coverage, and passing a complexity check where a weak model (Nova 2 Lite) fails at least once while a strong model (DeepSeek v3) passes at least twice. Tasks spanned multiple languages (Go, JavaScript, Python, Rust) and involved Docker-based test execution, coverage analysis (targeting 90%+ line and branch coverage), and iterative prompt/test refinement based on automated parity review feedback.


2025 - 2026

Behavioral Code Debugging Annotator

Data type: Computer Code Programming. Task type: Computer Programming Coding.
Annotated AI model debugging sessions using a 14-code behavioral taxonomy tracking how models approach code investigation, error diagnosis, and fix implementation. Tasks involved comparing model trajectories, identifying debugging patterns, and producing structured reviews with trajectory comparison across multiple programming languages. Each session was reviewed for correctness of diagnosis, efficiency of investigation strategy, and quality of the proposed fix relative to the actual codebase state.


2025 - 2026

Process Reward Model Annotator

Data type: Computer Code Programming. Task type: Computer Programming Coding.
Performed turn-level evaluation of Cline coding assistant conversations using the Datagen-PRM VS Code extension. Each bot turn was assessed across 11 metrics: correctness, completeness, independence, execution efficiency, reasoning quality (1-5 scale), and 5 reasoning chain annotations (thought-to-action alignment, thought continuity, action continuity, result-to-thought influence, result-to-action influence). Provided detailed justifications for each metric and wrote 50-200 word turn-level explanations grounded in concrete evidence. Session-level assessments included overall pass/fail rating, visual aesthetics, task categorization, and persona classification.


2025 - 2026

Code Review & PR Quality Analyst

Data type: Computer Code Programming. Task type: Computer Programming Coding.
Reviewed AI-generated pull requests against real GitHub issues from major open-source repositories (huggingface/transformers, scikit-learn, keras, yt-dlp). Tasks included generating reproducible Docker environments, running baseline test suites, comparing model trajectories against ground-truth PRs, evaluating code correctness and test coverage, and producing structured feedback. Managed dependency pinning, Dockerfile generation, and test verification across Python, JavaScript, and Rust ecosystems. Each review included checklist-based assessment and iterative feedback with re-evaluation cycles.


2025 - 2026

AI Model Trajectory Evaluator

Data type: Computer Code Programming. Task type: Transcription.
Evaluated pairs of AI coding assistant trajectories across 9 quality axes (correctness, naming, organization, error handling, documentation, review-readiness, logic, honesty, instruction following). Each task involved reading full model conversations (2,000-12,000+ lines), annotating strengths and weaknesses using a 13-code taxonomy (INST, OVERENG, TOOL, LAZY, VERIFY, FALSE, ROOT, DESTRUCT, FILE, HALLUC, DOCS, VERBOSE, FORMAT), verifying every claim against the actual trajectory content, and writing a comparative justification grounded in specific turn references. Output passed AI detection screening on all submissions. Handled multiple programming languages including Rust, Python, TypeScript, Go, and C#.


2025 - 2026

Education


Universidad Autónoma de Occidente

Bachelor of Science, Systems Engineering

2016 - 2021

Work History


Globant Sistemas Colombia

Senior Web UI Developer

Santiago de Cali
2025

Globant Sistemas Colombia

Senior Web UI Developer / Technical Lead

Santiago de Cali
2021 - 2024