Ananya G.

RLHF specialist, 1 year improving AI models with human feedback

Pune, India

Key Skills

Software

CrowdSource

Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Audio

Document

Text

Top Task Types

Evaluation/Rating

Prompt + Response Writing (SFT)

RLHF

Freelancer Overview

Over the past year, I’ve worked on AI training through Reinforcement Learning from Human Feedback (RLHF), evaluating and ranking AI-generated responses to make them more accurate, safe, and user-friendly. This role sharpened my eye for detail and taught me how to balance consistency with judgment, especially when dealing with grey areas or ambiguous outputs. What sets me apart is that I don’t just label data: I draw on my background in business analysis, digitalisation, and sustainability to add context and depth. I enjoy finding patterns, refining processes, and making sure the data used to train models is not only high-quality but also aligned with how people actually think and communicate. For me, it’s about shaping AI systems that are reliable, ethical, and genuinely helpful.

Labeling Experience

RLHF for LLMs

Internal/Proprietary Tooling

Text

RLHF

Evaluation/Rating

One of the key projects I worked on involved document-based datasets for AI fine-tuning through Reinforcement Learning from Human Feedback (RLHF). My role was to evaluate, rate, and rank AI-generated responses to text prompts so they could be used to train and improve large language models. The scope covered documents across different domains, from business and technical contexts to general knowledge, with each response assessed on factual accuracy, coherence, clarity, tone, and safety. To ensure consistency, I followed detailed labelling guidelines that outlined how to score responses, handle ambiguous or borderline cases, and flag content that was harmful, biased, or irrelevant. Where the guidelines didn’t fully address an edge case, I documented my reasoning to support improvements to the rules. Over the course of a year, I worked on several thousand responses, maintaining both speed and quality. This experience taught me how critical well-structured, h

2024

Education

University of Sussex

Master of Science, Strategic Innovation Management

2021 - 2023

Manchester Metropolitan University

Master of Science, Industrial Digitalisation

2019 - 2020

Work History

Zummit Infolabs

Lead Business Analyst

Pune

2024 - Present

N/A

ESG Consultant

London

2023 - 2023