
Vikash Meena

AI Trainer | Data Labeling | LLM Evaluation | Python & Machine Learning

New Delhi, India
$25.00/hr | Intermediate | Scale AI

Key Skills

Software

Scale AI

Top Subject Matter

Artificial Intelligence – LLM Training & Data Annotation
Software & Programming – Python / Machine Learning
Data Science – NLP & Data Analysis

Top Data Types

Text

Top Label Types

Classification
Evaluation Rating

Freelancer Overview

AI Trainer and Data Labeling Specialist with a strong background in machine learning and structured dataset preparation from IIT Delhi. Experienced in evaluating and annotating technical datasets, reviewing LLM outputs, and improving training data quality through response ranking, prompt evaluation, and rubric-based scoring. Skilled in identifying hallucinations, verifying information, and producing high-quality labeled datasets used in AI training workflows.

Hands-on experience building ML systems across NLP, Computer Vision, Time-Series Forecasting, and Retrieval-Augmented Generation (RAG). Developed projects including LSTM-based stock prediction, sensor-driven emission forecasting models, CNN image classifiers, NLP text classification systems, and an Agentic RAG pipeline with vector databases and semantic routing.

Proficient in Python, SQL, Pandas, NumPy, Scikit-learn, and TensorFlow, with strong expertise in feature engineering, data validation, and model evaluation. Experienced with web-based annotation interfaces, spreadsheet-based labeling workflows, and structured dataset verification.

Languages: English, Japanese, Punjabi, Hindi (Intermediate)

Labeling Experience

Python Code Review & Annotation – AI Training Dataset

Computer Code | Programming
Reviewed and annotated Python code examples for AI training datasets. Tasks included validating code correctness, categorizing programming tasks, identifying logical or syntax errors, and assigning quality ratings. Verified outputs and ensured clear, structured annotations to support machine learning model training and evaluation.


2025 - 2026
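A minimal sketch of what one step of such a code-review annotation workflow could look like, using Python's standard `ast` module to catch syntax errors before a human assigns a category and rating. The field names, category values, and 1-5 rating scale here are illustrative assumptions, not a real platform schema:

```python
import ast

def annotate_snippet(code: str, category: str, rating: int) -> dict:
    """Build an annotation record; flag snippets that fail to parse."""
    try:
        ast.parse(code)          # syntax check only; logic errors need human review
        syntax_ok, error = True, None
    except SyntaxError as exc:
        syntax_ok, error = False, str(exc)
    return {
        "category": category,    # e.g. the programming-task category
        "rating": rating,        # illustrative scale: 1 (poor) .. 5 (excellent)
        "syntax_ok": syntax_ok,
        "error": error,
    }

good = annotate_snippet("def add(a, b):\n    return a + b", "arithmetic", 5)
bad = annotate_snippet("def add(a, b)\n    return a + b", "arithmetic", 1)
```

Automated parsing only filters syntax errors; validating correctness and assigning quality ratings, as described above, still requires manual review.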

LLM Question Answering Dataset Evaluation – NLP & RAG Systems

Text | Question Answering
Worked on question-answering datasets involving prompt–response evaluation and validation of model-generated answers. Tasks included reviewing prompts and generated responses, verifying factual correctness, identifying incomplete or hallucinated answers, and improving response quality through structured evaluation. Evaluated responses based on reasoning clarity, relevance to the prompt, and instruction-following, while providing short justification comments for quality scoring. Cross-checked information against reliable sources when needed to ensure accuracy. This work was performed as part of NLP and Retrieval-Augmented Generation (RAG) experimentation, where question-answer datasets were used to improve response quality and retrieval performance in LLM systems.


2025 - 2025
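The rubric-based evaluation described above can be sketched as a weighted aggregate over scoring dimensions, with a short justification comment attached to each record. The dimension names, weights, and 0-5 scale are illustrative assumptions:

```python
# Illustrative rubric: weights must sum to 1.0.
RUBRIC_WEIGHTS = {
    "factual_accuracy": 0.4,
    "reasoning_clarity": 0.3,
    "instruction_following": 0.3,
}

def score_response(scores: dict, comment: str) -> dict:
    """Aggregate per-dimension scores (0-5) into a weighted overall score."""
    overall = sum(RUBRIC_WEIGHTS[dim] * scores[dim] for dim in RUBRIC_WEIGHTS)
    return {"scores": scores, "overall": round(overall, 2), "comment": comment}

record = score_response(
    {"factual_accuracy": 5, "reasoning_clarity": 4, "instruction_following": 4},
    "Accurate, but the final reasoning step could be explained more clearly.",
)
# record["overall"] -> 4.4
```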

AI Trainer – RLHF Data Labeling & LLM Evaluation (Scale AI)

Text | RLHF
Worked on AI training datasets focused on Reinforcement Learning from Human Feedback (RLHF) workflows. Tasks included reviewing prompt–response pairs generated by large language models, ranking candidate responses based on reasoning quality, factual accuracy, and instruction-following, and providing short justification comments according to defined evaluation rubrics. Performed dataset annotation and quality assurance on technical and STEM-related prompts, identifying hallucinations, logical inconsistencies, formatting issues, and incomplete reasoning in model outputs. Ensured high-quality labeled datasets by applying consistent scoring criteria and verifying responses against reliable sources when needed. Used web-based annotation interfaces and spreadsheet-based workflows to review, categorize, and structure labeled data. This work contributed to improving training datasets used for large language model alignment and performance evaluation.


2024 - 2025
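The candidate-ranking step in an RLHF workflow can be sketched as sorting responses by an overall rubric score and deriving (chosen, rejected) preference pairs from the ordering. The `score` field and response ids are illustrative assumptions:

```python
def rank_candidates(candidates: list) -> list:
    """Sort candidate responses best-first by their overall rubric score."""
    return sorted(candidates, key=lambda c: c["score"], reverse=True)

def preference_pairs(ranked: list) -> list:
    """Turn a best-first ranking into (chosen, rejected) id pairs."""
    return [(ranked[i]["id"], ranked[j]["id"])
            for i in range(len(ranked))
            for j in range(i + 1, len(ranked))]

cands = [{"id": "A", "score": 3.2},
         {"id": "B", "score": 4.6},
         {"id": "C", "score": 2.1}]
ranked = rank_candidates(cands)
pairs = preference_pairs(ranked)  # [("B", "A"), ("B", "C"), ("A", "C")]
```

Each pair records that the first response was preferred over the second, which is the form preference data typically takes for reward-model training.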

AI Trainer – Data Labeling & LLM Evaluation (Scale AI)

Text | Evaluation Rating
AI training and data labeling work focused on evaluating technical datasets and improving LLM output quality. Tasks included reviewing model-generated responses, categorizing outputs, assigning quality ratings, and providing short justification comments using rubric-based evaluation guidelines. Worked on STEM and technical datasets involving prompt-response evaluation, reasoning verification, and response ranking for RLHF training pipelines. Responsibilities included identifying hallucinations, correcting logical inconsistencies, and ensuring instruction-following and factual accuracy in model outputs. Used web-based annotation interfaces and spreadsheet-based workflows to review and structure labeled data. Maintained high-quality dataset standards by cross-checking outputs, applying consistent scoring criteria, and ensuring clarity and correctness of annotations.


2024 - 2025

Text Classification Dataset Annotation – NLP Training Data

Text | Classification
Annotated and classified text datasets for NLP model training. Tasks included labeling text into categories, validating dataset quality, identifying incorrect labels, and preparing structured training data for machine learning models.


2024 - 2024
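Identifying incorrect labels, as described above, can be sketched as a validation pass that flags rows whose label falls outside the allowed label set. The label set and sample rows are illustrative assumptions:

```python
# Illustrative closed label set for a text-classification task.
ALLOWED_LABELS = {"sports", "politics", "technology", "other"}

def find_invalid_rows(rows: list) -> list:
    """Return indices of rows whose label is not in the allowed set."""
    return [i for i, row in enumerate(rows)
            if row["label"] not in ALLOWED_LABELS]

rows = [
    {"text": "The match ended 2-1.", "label": "sports"},
    {"text": "New GPU released today.", "label": "tech"},  # not in the set
]
invalid = find_invalid_rows(rows)  # [1]
```

Flagged rows would then be corrected or escalated, keeping the training data consistent with the label taxonomy.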

Education

Indian Institute of Technology Delhi

BTech, AI/ML and Production & Industrial Engineering
2022 - 2026

Work History

No Work History added yet

Vikash M. hasn’t added any Work History to their OpenTrain profile yet.