Louis Ontiberos

Machine Learning Data Labeler and AI Dataset Specialist

Chico, USA
$25.00/hr
Intermediate

Key Skills

Software

No software listed

Top Subject Matter

Programming Domain Expertise
C++ Domain Expertise
Machine Learning Training

Top Data Types

Text
Audio

Top Task Types

Prompt + Response Writing (SFT)
Classification

Freelancer Overview

Machine learning data labeler and AI dataset specialist with 2+ years of professional experience in data processing, annotation, and quality-focused dataset work. Core strengths include internal and proprietary tooling. Holds a Bachelor of Science in Computer Science from the University of California, Davis (2023). AI-training work covers code and text data, with labeling workflows including Prompt + Response Writing (SFT) and Classification.

English (Intermediate)

Labeling Experience

Machine Learning Data Labeler and AI Dataset Specialist

Prompt + Response Writing (SFT)
Generated and labeled instruction-response pairs for supervised fine-tuning datasets tailored to programming and coding models. Extracted C++ code samples from GitHub repositories, formatted datasets, handled deduplication, and performed quality control checks. Documented data sources and followed reproducible labeling workflows.
• Worked extensively with code data and technical text
• Ensured high-quality, curated data for machine learning pipelines
• Utilized labeling and dataset preprocessing tools
• Supported accuracy and adherence to detailed requirements

2025 - Present

Audio DSP Coding Dataset for LLM Fine-Tuning - Independent Project

Prompt + Response Writing (SFT)
Developed a supervised fine-tuning dataset composed of DSP-focused C++ code and structured instruction-response pairs for LLM training. Converted audio processing tutorials and open-source codebases into training examples following ChatML format. Applied rigorous dataset filtering and deduplication for high data quality.
• Utilized LLM-assisted pipelines for dataset generation
• Prepared datasets for QLoRA fine-tuning
• Included deep coverage of audio plugin coding topics
• Ensured comprehensive labeling for DSP model development

2025

AI Data Annotation Assistant

Text
Classification
Labeled and validated datasets composed of technical documentation and code samples for machine learning applications. Managed categorization and annotation of examples to build supervised training data. Conducted dataset verification and consistency checks for annotation quality.
• Maintained dataset documentation for accuracy
• Executed labeling tasks for various data types
• Supported dataset preparation for NLP and coding models
• Ensured annotation adherence to guidelines

2024

Education

University of California, Davis

Bachelor of Science, Computer Science

2019 - 2023

Work History

North Valley Analytics

Data Processing Assistant

Chico
2023 - 2024