Ananya G.

RLHF specialist, 1 year improving AI models with human feedback

Pune, India

Key Skills

Software

CrowdSource
Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Document
Text

Top Task Types

Evaluation/Rating
Prompt + Response Writing (SFT)
RLHF

Freelancer Overview

Over the past year, I’ve worked on AI training through Reinforcement Learning from Human Feedback (RLHF), evaluating and ranking AI-generated responses to make them more accurate, safe, and user-friendly. This role sharpened my eye for detail and taught me how to balance consistency with judgment, especially when dealing with grey areas or ambiguous outputs. What sets me apart is that I don’t just label data: I draw on my background in business analysis, digitalisation, and sustainability to add context and depth. I enjoy finding patterns, refining processes, and making sure the data used to train models is not only high-quality but also aligned with how people actually think and communicate. For me, it’s about shaping AI systems that are reliable, ethical, and genuinely helpful.

Labeling Experience

RLHF for LLMs

Internal/Proprietary Tooling
Text
RLHF
Evaluation/Rating

One of the key projects I worked on involved document-based datasets for AI fine-tuning through Reinforcement Learning from Human Feedback (RLHF). My role was to evaluate, rate, and rank AI-generated responses to text prompts so they could be used to train and improve large language models. The scope covered documents across different domains, from business and technical contexts to general knowledge, and I assessed the responses on factual accuracy, coherence, clarity, tone, and safety. To ensure consistency, I followed detailed labelling guidelines that outlined how to score responses, handle ambiguous or borderline cases, and flag content that was harmful, biased, or irrelevant. Where the guidelines didn’t fully address an edge case, I documented my reasoning to support improvements to the rules. Over the course of a year, I worked on several thousand responses, maintaining both speed and quality. This experience taught me how critical well-structured, h

2024

Education

University of Sussex

Master of Science, Strategic Innovation Management

2021 - 2023
Manchester Metropolitan University

Master of Science, Industrial Digitalisation

2019 - 2020

Work History

Zummit Infolabs

Lead Business Analyst

Pune
2024 - Present
N/A

ESG Consultant

London
2023 - 2023