Alain Chirino Trujillo

Bilingual Language Engineer in AI research and design

Atlanta, GA, USA
$23.00/hr | Expert | Appen | Data Annotation Tech | Dataloop

Key Skills

Software

Appen
Data Annotation Tech
Dataloop
Prodigy
SuperAnnotate

Top Subject Matter

Figurative language annotation in Spanish
Video captioning and transcription
Picture emotion detection labeling

Top Data Types

Image
Text
Video

Top Task Types

Classification
Data Collection
Entity (NER) Classification
Fine Tuning
Prompt/Response Writing (SFT)

Freelancer Overview

I have extensive experience in data labeling and AI training data, particularly in Natural Language Processing (NLP). My work has included annotating large datasets, such as over 10,000 examples for a Multilingual Euphemisms project, using corpus linguistics techniques and tools. I have also fine-tuned language models such as BERT and RoBERTa to improve their performance on tasks such as named entity recognition. Combining large-scale data processing with a solid understanding of machine learning and NLP lets me deliver high-quality training data that improves model accuracy and adaptability across domains.

I also have a strong academic background, including a Master's in Computational Linguistics, where I contributed to projects involving multilingual experiments and semantic network analysis. I develop and fine-tune models using techniques such as Domain-Adaptive Pre-training (DAPT) and span-level masking. These skills, together with a solid foundation in Python, TensorFlow, and data annotation tools, help me bridge the gap between data collection and model optimization.

Expert | English | Spanish

Labeling Experience

SuperAnnotate

Emotion Recognition Detection in Conversations using MELD

SuperAnnotate | Video | Emotion Recognition
In this project, I focused on developing a multimodal model for automatically detecting questionable or obscene content in children's media, using multilingual natural language processing, text and video analysis, and deep learning. A key part of my role was fine-tuning the MARLIN model for multimodal emotion detection and recognition. This required precise annotation to label and integrate several modalities, including facial expressions, audio cues, and textual information, in order to improve the model's emotion classification accuracy. I also served as a liaison and translator between Mexican and American research groups, supporting collaboration and helping integrate both groups' perspectives into the project. My data annotation work was central to training the model to recognize complex emotional cues across languages and cultures.

2024
Appen

Multilingual Euphemisms Project

Appen | Text | Classification
The Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms (PETs) project focuses on identifying and disambiguating euphemistic language across multiple languages. I collected and annotated over 6,000 data examples and improved Python algorithms that generate PETs for research. I trained and fine-tuned models such as BERT and RoBERTa to improve their handling of the ambiguity and vagueness inherent in euphemistic language. The project aimed to improve the performance of NLP models in multilingual contexts, and its findings were shared through publications and workshops.

2022 - 2024

Education

Montclair State University

Master of Science, Computational Linguistics

2022 - 2024

UCP Enrique Jose Varona

Bachelor of Arts, English

2010 - 2015

Work History

University of Houston

Machine Learning Researcher (Intern)

Puebla
2024 - 2024

Montclair State University

Graduate Research Assistant

Montclair, NJ
2022 - 2024