Junia Yara Penachioni - AI Trainer - Machine Learning & Data Annotation

Key Skills

Software

Appen

Labelbox

Scale AI

Mercor

Top Subject Matter

No subject matter listed

Top Data Types

Video

Text

Image

Audio

Top Task Types

Classification

Text Generation

Evaluation/Rating

Prompt + Response Writing (SFT)

Question Answering

Freelancer Overview

I am a PhD-trained biologist with extensive experience in data annotation and labeling for large language models, specializing in projects that span text, image, video, and audio modalities. My background includes hands-on work with AI training data for STEM, medical, and scientific domains, where I have contributed to projects such as Project Kaleidoscope, Project Phoenix, and Project Hedgehog on platforms like Labelbox and Appen. I excel at evaluating and synthesizing complex scientific and technical content, ensuring high-quality and accurate data for machine learning models. Fluent in English, Italian, and Portuguese, I bring strong attention to detail, advanced analytical skills, and a deep understanding of research workflows, making me adept at improving AI systems through precise annotation, content evaluation, and text classification.

Entry Level, Italian, Portuguese, English

Labeling Experience

Data Annotator – LLM AI Projects (Alignerr/Labelbox AI)

Labelbox, Text, Evaluation/Rating

As a data annotator for LLM AI projects at Alignerr using Labelbox AI, I specialized in text evaluation and label quality control. My responsibilities included reviewing and rating textual data produced by AI models. I applied structured evaluation criteria to annotate content for model improvement.

• Labeled and rated diverse text samples
• Ensured quality and clarity in AI-generated outputs
• Used Labelbox AI to manage annotation workflow
• Collaborated remotely with a distributed team

2026

Data Annotator – STEM-Biology (Handshake)

Mercor, Text, Classification

As a data annotator for Handshake, I contributed to the enrichment of LLMs in STEM and Biology topics. My main duty was to categorize and label academic texts for accuracy and content appropriateness. Rigorous annotation improved AI model reliability in biological sciences tasks.

• Reviewed and classified STEM-Biology educational texts
• Applied taxonomy standards for text categorization
• Engaged in iterative annotation cycles with feedback
• Employed project-specific annotation systems

2025

Data Annotator – Project Kaleidoscope (Outlier AI)

Scale AI, Text, Classification

As a data annotator for Outlier AI on Project Kaleidoscope, I focused on classifying STEM/Biology texts to train large language models for domain accuracy. My work emphasized subject-matter relevance, clarity, and alignment with academic standards. This contributed to enhancing LLM understanding of complex biology concepts.

• Annotated and classified STEM/Biology textual data
• Ensured precise mapping of biology topics to model output
• Utilized standard annotation tools and protocols
• Liaised with project leads for domain alignment

2025

Labeler

Scale AI, Text, Evaluation/Rating, Prompt + Response Writing (SFT)

Wrote prompts and responses for supervised fine-tuning (SFT), and evaluated and rated model responses.

2025

Annotator

LabelboxTextText Generation

Generated conversations with the model across different fields.

2025

Education

Fundação Getúlio Vargas

Master of Business Administration, Project Management

2014 - 2016

University of Turin – IRCC

Doctor of Philosophy, Cell Science and Technologies

2004 - 2008

Work History

COI Institute

Education Coordinator

Rio de Janeiro

2013 - 2016

ITT

Postdoctoral Researcher

Florence

2010 - 2012