Charles Ngaire

Expert in AI computer coding in Python

California, USA
$18.00/hr, Expert, Clickworker, CrowdSource, Data Annotation Tech

Key Skills

Software

Clickworker
CrowdSource
Data Annotation Tech
iMerit
Mercor
Mindrift
Prodigy
Remotasks
Toloka
Telus
Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming
Medical DICOM
Text

Top Task Types

Classification
Computer Programming Coding
Geocoding
Prompt Response Writing SFT
Text Generation

Freelancer Overview

My expertise lies at the intersection of data labeling and software engineering, with a specialized focus on creating high-quality training data for code intelligence and large language models. I am proficient in the full pipeline, from designing annotation guidelines and managing labeling teams to building custom tools with Python, AST parsers, and libraries like Tree-sitter to automate and scale the data generation process. My work is grounded in a data-centric AI philosophy, aiming to systematically improve model performance through meticulously curated datasets.

What sets me apart is my deep technical ability to understand and label complex, structured data like source code. I don't just annotate text; I engineer datasets for specific learning objectives, such as code summarization, bug detection, and program synthesis. My research and projects, including developing novel labeling pipelines for instruction-following data and improving code search via contrastive learning, demonstrate a proven track record of creating data that directly enhances model capabilities on challenging, real-world tasks.

Expert, English, Spanish

Labeling Experience

Mindrift

Instruction-Tuning Dataset for Code Generation Models

Mindrift, Computer Code Programming, Computer Programming Coding, Prompt Response Writing SFT
The scope of this project was to create a high-quality instruction-tuning dataset to improve the code generation capabilities of a large language model.

2022 - 2023

Education

Stanford University

Master of Science in Computer Science
2023 - 2025

Work History

Google

AI Engineering Intern

Mountain View
2024