Gibbs Lee

AI Training Data Designer and Problem Author

Austin, TX, USA
Expert

Key Skills

Software

No software listed

Top Subject Matter

STEM Domain Expertise
Machine Learning
Advanced Reasoning

Top Data Types

Document
Text

Top Task Types

Prompt + Response Writing (SFT)

Freelancer Overview

AI Training Data Designer and Problem Author with 11+ years of professional experience across complex professional workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Holds a Bachelor of Science in Computer Science from the University of Texas at Austin (2016). AI-training work covers data types such as computer code and programming, and labeling workflows including Prompt + Response Writing (SFT), Evaluation, and Rating.

Labeling Experience

Advanced AI Training Problem Author

Prompt + Response Writing (SFT)
Designed a library of over 80 original, computationally intensive STEM and ML problems for advanced-reasoning and GenAI model training. Each problem set included validated Python solutions with comprehensive documentation suitable for fine-tuning and evaluation tasks. Subject matter covered optimization, probabilistic inference, and feature selection, and tested advanced coding proficiency.
• Emphasized solution reproducibility and C1+ English technical clarity.
• Used Python (NumPy, pandas, SciPy, scikit-learn) for problem creation and testing.
• Verified test cases for each problem to ensure correctness for training datasets.
• Designed problems for direct use in AI/ML system refinement and assessment.

2023 - Present

AI Training Data Designer and Problem Author

Prompt + Response Writing (SFT)
Designed and authored over 40 original, computationally intensive STEM and ML problem sets for internal AI training datasets. Each problem included step-by-step Python-based solutions, advanced reasoning requirements, and fully documented validation at C1+ English proficiency. Problem-solution pairs were reviewed and passed through automated ML solution verification before dataset inclusion.
• Integrated solution verification workflows using Python (NumPy, pandas, SciPy, scikit-learn).
• Used GenAI tools such as the OpenAI API, IBM Watson, and LangChain for pipeline automation.
• Established quality checkpoints and protocols for dataset milestone validation.
• Authored reproducible, high-quality technical documentation for problem-solution datasets.

2021 - Present

GenAI-Assisted ML Solution Verification and Evaluation Engineer

Built and operated an automated pipeline that used GenAI APIs and Python scripting to verify and evaluate machine learning problem solutions. Automated cross-checking flagged numerical inconsistencies, validated statistical claims, and generated quality-controlled documentation for dataset incorporation. The pipeline supported data annotation tasks by ensuring that only rigorously validated solutions were retained for AI fine-tuning sets.
• Integrated the OpenAI API for automated verification and quality control.
• Maintained over 99% solution accuracy across all reviewed problems.
• Generated documentation aligned with scientific rigor and reproducibility standards.
• Reduced manual review time while improving labeling reliability for model training.

2022

AI Training Curriculum Problem Designer

Prompt + Response Writing (SFT)
Created computationally intensive Python-based problem sets for internal ML engineer onboarding and proficiency assessment, contributing to curriculum design for AI model training. Developed original prompts and grading solutions to evaluate full data science stack capabilities, with problems used as supervised input/output pairs in ML curricula. Used generative AI writing tools to streamline prompt authoring and documentation generation for curriculum datasets.
• Performed advanced statistical validation and testing of generated problem solutions.
• Used Python (NumPy, pandas, SciPy, statsmodels, scikit-learn) and SQL databases for data handling.
• Automated parts of the authoring process using LangChain and the OpenAI API.
• Produced structured, reproducible scientific documentation for QA review.

2018 - 2021

STEM/AI Training Problem Author for Analyst Onboarding

Prompt + Response Writing (SFT)
Authored original STEM-based analytical problems and Python solution scripts used for training and upskilling junior analysts in a government risk analytics context. Problems were reviewed by senior data scientists before inclusion in internal AI training programs and documentation libraries. Each submission was validated for correctness, reproducibility, and advanced analytical reasoning.
• Focused on classification, regression, and probabilistic inference for federal datasets.
• Used Python (scikit-learn, NumPy, pandas, statsmodels) and SQL for solution implementation.
• Produced technical documentation matching rigorous internal QA standards.
• Ensured cross-functional utility and technical clarity for team training goals.

2016 - 2018

Education

University of Texas at Austin

Bachelor of Science, Computer Science

2012 - 2016

Work History

IBM

Senior Machine Learning Engineer

Austin, TX
2021 - Present

Dell Technologies

Machine Learning Engineer

Round Rock, TX
2018 - 2021