Aryan Dadwal - LLM Evaluation Expert in English, Hindi, Punjabi and French

Key Skills

Software

AWS SageMaker

Appen

CrowdFlower

CrowdSource

Labelbox

Remotasks

Scale AI

Surge AI

Internal/Proprietary Tooling

OneForma

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming

Document

Medical DICOM

Top Task Types

Computer Programming Coding

Evaluation Rating

Fine Tuning

Prompt Response Writing SFT

Translation Localization

Freelancer Overview

With hands-on experience in both Reinforcement Learning from Human Feedback (RLHF) and large-scale search quality evaluation, I bring a proven track record in AI training data operations. I’ve contributed to the alignment of Hindi-language LLMs through prompt evaluation and model fine-tuning with Outlier AI, helping refine generative outputs for safety, coherence, and factual correctness. At RWS TrainAI, I worked on geographically aware search quality assessments, rating search results for intent match, user satisfaction, and contextual relevance. My skill set spans diverse labeling types, including intent classification, prompt ranking, query understanding, sentiment tagging, and named entity recognition. I’m comfortable working with both structured guidelines and open-ended feedback tasks, and I deliver high-quality contributions even in subjective or nuanced evaluations. With strong attention to detail, multilingual capabilities (Hindi, English, French), and a passion for improving AI systems, I’m well prepared to support a wide range of projects across NLP, search, and LLM training domains.

Expert: Hindi, French, English, Punjabi

Labeling Experience

E-commerce Image Tagging for Visual AI

Labelbox, Image, Bounding Box, Segmentation

Labeled 10,000+ product images for an AI-based visual recommendation engine. Tasks involved drawing bounding boxes around clothing items (shirts, shoes, accessories), labeling style categories, and segmenting background elements. Contributed to improving the visual search pipeline and product discovery on mobile apps. Maintained quality score >95% through consensus-based QA checks.

2024 - 2024

Hindi LLM Prompt Evaluation (RLHF) – Outlier AI

Scale AI, Text, Translation Localization, RLHF

Evaluated Hindi-language LLM prompt completions, rating responses on factual accuracy, coherence, safety, and alignment with human intent. Helped train the model via reinforcement learning from human feedback (RLHF) and supervised fine-tuning (SFT). Tasks involved side-by-side comparisons, instruction tuning, and adversarial response testing. Ensured high annotation quality through strict adherence to token-level accuracy and subjective alignment metrics.

2024 - 2024

Search Quality Evaluation – RWS TrainAI

Internal Proprietary Tooling, Text, Classification, Geocoding

Assessed the quality and relevance of search engine results based on user intent, query complexity, and location context. Conducted multiple ratings per query, including "Needs Met," content quality, and geo-specific accuracy. Handled hundreds of queries daily with consistent calibration scores above 90%.

2023 - 2024

Invisible AI

Internal Proprietary Tooling, Computer Code Programming, Computer Programming Coding

Trained a coding model

2021 - 2024

Audio to Text Conversion

OneForma, Audio, Text Generation, Translation Localization

Transcribed 500 Hindi and English audio files to text.

2022 - 2023

Education

Indian Institute of Technology Jodhpur

Bachelor of Science, Applied Artificial Intelligence And Data Science

2020 - 2024

Work History

RWS – Project Callisto

Google Search Quality Rater

Remote

2025 - 2025

Jyovis Healthcare Solutions Pvt. Ltd.

AI Intern

On-site

2025 - 2025