Data Labeling Experience: AI Code Generation Evaluation and Training Data Collection
Led a comprehensive data labeling initiative for AI coding assistant training:

- Created 750+ automated test cases and evaluation frameworks spanning the JavaScript, TypeScript, Python, and Java ecosystems (a minimal harness sketch follows this list).
- Developed systematic labeling protocols for code quality assessment, correctness validation, and performance measurement of AI-generated code samples (see the rubric schema sketch below).
- Processed and labeled more than 8 million coding interactions per day from AI agents, producing structured datasets for model training and behavioral analysis.
- Established evaluation standards for repository migration tasks, bug fixes, and feature implementations, focusing on prompt engineering optimization and response quality assessment for supervised fine-tuning (see the streaming dataset sketch below).
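As a rough illustration of the automated correctness validation described above, here is a minimal Python sketch of a harness that runs an AI-generated snippet in a subprocess and derives a pass/fail label. The `TestCase` shape, the stdout-comparison rule, and the pass-rate verdict are illustrative assumptions, not the actual framework used.

```python
import os
import subprocess
import sys
import tempfile
from dataclasses import dataclass

@dataclass
class TestCase:
    stdin: str      # input fed to the candidate program (assumed I/O protocol)
    expected: str   # expected stdout, compared after stripping whitespace

def run_case(code: str, case: TestCase, timeout_s: float = 5.0) -> bool:
    """Run AI-generated code in an isolated subprocess and check its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            input=case.stdin,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return proc.returncode == 0 and proc.stdout.strip() == case.expected.strip()
    except subprocess.TimeoutExpired:
        return False  # a hang is labeled as a failure
    finally:
        os.unlink(path)

def label_sample(sample_id: str, code: str, cases: list[TestCase]) -> dict:
    """Aggregate per-case results into a single correctness label."""
    results = [run_case(code, c) for c in cases]
    return {
        "sample_id": sample_id,
        "pass_rate": sum(results) / len(results),
        "verdict": "correct" if all(results) else "incorrect",
    }
```

Running each sample in its own subprocess with a timeout keeps a single pathological generation (infinite loop, crash) from stalling the rest of the evaluation batch.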
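The labeling protocols themselves can be pictured as a structured rubric. The sketch below assumes a four-point quality scale and a handful of record fields (`interaction_id`, `task_type`, and so on); the actual scale and schema are not specified above and are hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum

class Quality(Enum):
    """Assumed four-point quality scale; the real rubric may differ."""
    REJECT = 0       # does not run or fails validation
    FUNCTIONAL = 1   # runs, but with style or robustness problems
    ACCEPTABLE = 2   # correct with minor issues
    EXEMPLARY = 3    # correct and idiomatic

@dataclass
class CodeLabel:
    interaction_id: str   # hypothetical identifier field
    language: str         # e.g. "python", "typescript"
    task_type: str        # e.g. "bug_fix", "repo_migration", "feature"
    quality: Quality
    tests_passed: bool
    notes: str = ""

    def to_jsonl(self) -> str:
        """Serialize one label as a JSONL line for the structured dataset."""
        record = asdict(self)
        record["quality"] = self.quality.name  # store the rubric level by name
        return json.dumps(record)

# Example: one labeled interaction becomes one dataset row.
print(CodeLabel("ix-0001", "python", "bug_fix", Quality.ACCEPTABLE, True).to_jsonl())
```

Encoding the rubric as an enum rather than free-form text keeps labels consistent across annotators and makes downstream filtering trivial.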
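Finally, turning high-volume labeled interactions into supervised fine-tuning data is, at its core, a streaming filter. This sketch assumes JSONL records carrying `prompt`, `response`, `quality`, and `tests_passed` fields, matching the hypothetical rubric above; the actual pipeline and field names are assumptions.

```python
import json
from pathlib import Path
from typing import Iterator

def stream_jsonl(path: Path) -> Iterator[dict]:
    """Yield records one line at a time; at millions of interactions per day,
    the full dataset should never be loaded into memory at once."""
    with path.open() as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

def build_sft_file(labeled: Path, out: Path) -> int:
    """Keep interactions that clear the quality bar and emit prompt/response
    pairs for supervised fine-tuning. Returns the number of pairs kept."""
    keep = {"ACCEPTABLE", "EXEMPLARY"}  # levels from the hypothetical rubric above
    kept = 0
    with out.open("w") as sink:
        for rec in stream_jsonl(labeled):
            if rec.get("quality") in keep and rec.get("tests_passed"):
                sink.write(json.dumps(
                    {"prompt": rec["prompt"], "response": rec["response"]}
                ) + "\n")
                kept += 1
    return kept
```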