Jacob Martinez

Expert in LLM Evaluation & Multilingual Text Annotation

Dallas, USA
$20.00/hr · Expert · Appen · Data Annotation Tech · Labelbox

Key Skills

Software

Appen
Data Annotation Tech
Labelbox
Mindrift
Remotasks
Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code / Programming
Image
Text

Top Task Types

Classification
Data Collection
Evaluation Rating
Question Answering
Translation Localization

Freelancer Overview

With over five years of experience in training-data creation and AI labeling, I specialize in multilingual text annotation and large language model (LLM) evaluation, with a focus on English and Spanish datasets. My expertise covers natural language processing (NLP) tasks such as named entity recognition (NER), text generation, and translation/localization, using tools including Labelbox, Datasaur, and Doccano. I have contributed to high-value projects such as optimizing conversational AI models for customer-service scenarios and testing LLMs for cultural-nuance accuracy to deliver robust, contextually relevant outputs. My ability to handle difficult text and document datasets, together with experience in proprietary tooling, enables me to deliver accurate, scalable annotations that improve model performance across multiple industries.

My particular strength is a deep understanding of linguistic subtleties and the ability to bridge language gaps in AI training data for multilingual products. I have a history of working with cross-functional teams to build high-quality datasets for conversational AI and sentiment analysis while maintaining data integrity and optimizing model results. Experience in audio analysis for speech recognition systems rounds out my skill set and allows me to contribute to multimodal AI training projects. My commitment to accuracy and versatility translates well to tasks requiring culturally aware, perceptive data annotation.

Expert · English · Spanish

Labeling Experience

Scale AI

Multilingual Text Annotation for Conversational AI (Remotasks, 2023-2024)

Scale AI · Text · Text Generation
The Multilingual Text Annotation for Conversational AI project aimed to improve a chatbot system with high-quality multilingual training data in Spanish and English. The scope covered text annotation for NLP tasks such as named entity recognition (NER), intent classification, and sentiment analysis. I tagged over 10,000 words of text, labeling entities such as names, locations, and organizations, and tagging user intents (e.g., complaints, questions, bookings) to improve the chatbot's contextual understanding. The work required precise markup of conversational dialogues while preserving cultural and linguistic integrity, particularly for Spanish-language corpora. Quality control included rigorous adherence to annotation guidelines, maintaining an inter-annotator agreement (IAA) score of 90% or above, and regular randomized sampling checks. I worked with a team of annotators to ensure consistency and participated in bi-weekly calibration sessions.

2023 - 2024
Appen

LLM Evaluation and Prompt Engineering (Appen, 2022-2023)

Appen · Text · Evaluation Rating
The LLM Evaluation and Prompt Engineering project sought to improve a multilingual large language model's (LLM) performance by evaluating its output and creating high-quality prompt-response datasets for supervised fine-tuning (SFT). The work involved rating model-generated English and Spanish text for coherence, relevance, factual accuracy, and cultural sensitivity, and annotating over 5,000 prompt-response pairs to advance the model's conversational capabilities. Individual tasks included rating model responses for quality on a 5-point scale, flagging inappropriate or biased outputs, and writing prompts to elicit contextually relevant responses for customer support and information retrieval applications. The project worked with a dataset of approximately 7,500 text samples, with particular attention to preserving linguistic nuance in Spanish responses. Quality control required at least a 95% accuracy rate in testing, confirmed through double-blind reviews, and a low error rate.

2022 - 2023

Education

New Mexico State University – Las Cruces, NM

Master's in Mathematics
2018 - 2020
New Mexico State University – Las Cruces, NM

Bachelor's in Mathematics (Minor in Computer Science)
2014 - 2020

Work History

Self-employed

Freelance AI Tutor – Mathematics

Remote
2023 - Present
Doña Ana Community College

Mathematics Instructor

Las Cruces
2020 - 2023