CharXiv
Project: Prompt Composition and Chain-of-Thought Evaluation

- Developed and evaluated reasoning-focused prompts designed to test model problem-solving and critical-thinking capabilities.
- Wrote each prompt to require deterministic, single-answer reasoning, often involving multi-step analysis such as comparison, trend detection, or pattern recognition from visual data.
- Ensured all prompts referenced only visual information, excluding captions or OCR text.
- Structured model reasoning via Chain of Thought (CoT) with 12-15 sequential, atomic steps to enhance interpretability and logical flow.
- Conducted RLHF evaluation, rating model reasoning for correctness, logical consistency, and visual-interpretation accuracy.
- Produced concise, verifiable final answers (MCQs or short-form responses) to benchmark model reliability.
- Followed standardized linguistic, formatting, and verification protocols to ensure reproducibility and high-quality data for model fine-tuning.
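The protocol above (12-15 atomic CoT steps, a concise deterministic answer) lends itself to automated checks. The sketch below is a minimal, hypothetical validator; the record fields and thresholds are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass

@dataclass
class CoTRecord:
    """One annotation record: prompt, reasoning steps, final answer.
    Field names are hypothetical; the real schema was not specified."""
    prompt: str
    steps: list[str]
    final_answer: str

def validate(record: CoTRecord) -> list[str]:
    """Return a list of protocol violations (empty list means it passes)."""
    errors = []
    # CoT must contain 12-15 sequential, atomic steps.
    if not 12 <= len(record.steps) <= 15:
        errors.append(f"expected 12-15 steps, got {len(record.steps)}")
    # Atomicity heuristic: each step should read as a single sentence.
    for i, step in enumerate(record.steps, 1):
        if step.rstrip(".").count(".") > 0:
            errors.append(f"step {i} may contain more than one sentence")
    # Final answer must be concise and verifiable (MCQ letter or short form).
    if len(record.final_answer.split()) > 10:
        errors.append("final answer is not short-form")
    return errors
```

A check like this can gate submissions before human RLHF rating, so reviewers spend time on reasoning quality rather than formatting compliance.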