Anahita Gopinath Elias

AI Model Evaluation Contractor

Delhi NCR, India
$9.00/hr | Intermediate | Scale AI | Label Studio | Google Cloud Vertex AI

Key Skills

Software

Scale AI
Label Studio
Google Cloud Vertex AI

Top Subject Matter

Large Language Model Output Evaluation
Conversational AI / NLP Model Training
Code Generation & Review

Top Data Types

Text
Audio
Video

Top Task Types

Data Collection
Evaluation/Rating
Computer Programming/Coding

Freelancer Overview

B.Tech student with hands-on experience in AI model evaluation and data annotation across text-based workflows. Skilled in evaluation, rating, and data collection tasks, with practical exposure to platforms such as Scale AI and Label Studio. Known for quality-focused execution and attention to detail in labeling pipelines.

Data Labeling/AI Training Experience: Worked on text annotation and model evaluation projects that involved rating AI-generated responses for quality, accuracy, and coherence. Experienced in structured data collection workflows and in applying labeling guidelines consistently across large datasets.

Currently studying at Shiv Nadar Institution of Eminence Deemed To Be University.

English: Intermediate

Labeling Experience

Scale AI

AI Model Evaluation Contractor

Scale AI | Text
As an AI Model Evaluation Contractor, I evaluated and rated large language model (LLM) outputs across six task categories using structured rubrics. I designed test prompts and scoring guidelines to assess model performance and reported failure modes to inform future fine-tuning. I worked closely with a team of annotators to keep inter-annotator agreement consistently high.
• Completed over 2,400 output evaluation tasks focused on instruction-following, factual accuracy, and code quality.
• Designed edge-case prompts and rubrics to systematically probe how models handle multi-step and ambiguous inputs.
• Documented more than 140 model failure cases as structured feedback to help prioritize reinforcement learning from human feedback (RLHF) work.
• Maintained inter-annotator agreement above 0.88 (Cohen's kappa) through joint review and shared guidelines.

2025 - Present
Label Studio

Undergraduate ML Research Assistant

Label Studio | Text | Data Collection
As an Undergraduate ML Research Assistant, I gathered, cleaned, and prepared a large conversational dataset for training a dialogue-intent classification model. I contributed to data preprocessing workflows and supported the research team in assembling high-quality data points. My work helped make the dataset ready for downstream training and experimentation.
• Collected and preprocessed 50,000 samples of conversational text data for model training.
• Performed manual data cleaning and organization under the supervision of faculty and PhD researchers.
• Implemented structured labeling criteria for dialogue-intent classification tasks.
• Contributed to a literature review on transformer-based NLP fine-tuning for research dissemination.

2025 - 2026

Education

Shiv Nadar Institution of Eminence Deemed To Be University

Bachelor of Technology, Mechanical Engineering

2025 - 2029

Work History

Shiv Nadar AI Lab

Undergraduate ML Research Assistant

Delhi NCR
2025 - 2026

Game Design Plus

Software Engineering Intern

Chennai
2024 - 2024