Arajit Samal

Senior Research Analyst - AI Advanced Mathematics & Physics

New Delhi, India
$50.00/hr, Expert, Scale AI, Labelbox

Key Skills

Software

Scale AI
Labelbox

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming
Text

Top Task Types

Question Answering
Text Generation
RLHF
Computer Programming/Coding
Prompt + Response Writing (SFT)
Evaluation/Rating

Freelancer Overview

I am an AI research scientist and physicist with extensive experience in designing, evaluating, and curating high-precision datasets for training and fine-tuning large language models (LLMs). My background in advanced mathematics, physics, and complex systems enables me to develop rigorous benchmarks and annotation methodologies that probe model reasoning, abstraction, and numerical accuracy. I have systematically analyzed model outputs to identify and address reasoning failures, hallucinations, and edge-case behaviors, directly informing improvements in data labeling quality and model alignment. My work emphasizes interpretability, reliability, and correctness in AI systems, and I excel at collaborating with interdisciplinary teams to translate theoretical insights into scalable, reproducible data pipelines for high-stakes domains such as STEM, physics, and mathematical reasoning.

Expert: Hindi, Bengali, Odia (Oriya), Spanish, French, English, Italian

Labeling Experience

Scale AI

Senior Research Analyst – LLM Evaluation and Dataset Alignment

Scale AI, Text, Evaluation/Rating
As a Senior Research Analyst at Turing, I focused on the evaluation and alignment of Large Language Models (LLMs) in STEM, physics, and mathematics. I applied formal mathematical reasoning to develop, curate, and evaluate datasets and prompts aimed at improving LLM reasoning, generalization, and robustness. My work included designing mathematical reasoning benchmarks, systematically analyzing model outputs, and supporting safe, scalable AI systems.
• Developed and validated high-difficulty math reasoning tasks for LLM evaluation.
• Conducted failure-mode analysis and labeled outputs to identify hallucinations.
• Enhanced dataset quality, correctness, and interpretability for LLM training.
• Collaborated on scalable, reproducible AI research pipelines.

2025
Labelbox

AI Research Scientist – LLM Benchmarking and Dataset Curation

Labelbox, Text, Evaluation/Rating
As an AI Research Scientist at Turing, I contributed to the design, curation, and evaluation of complex reasoning benchmarks and datasets for LLMs. I performed systematic analysis of LLM behaviors and developed high-precision mathematical datasets to support reasoning-centric model training and fine-tuning. This hands-on work ensured improvements in correctness, transparency, and safety in AI outputs.
• Designed multi-step reasoning benchmarks for evaluating LLM abstraction and logic.
• Labeled and analyzed output failures, hallucinations, and consistency errors.
• Curated datasets and supported AI alignment with improved controllability.
• Collaborated to translate theoretical insights into practical evaluation pipelines.

2024 - 2025
Scale AI

My Data Labelling Tools

Scale AI, Computer Code Programming, Question Answering, Text Generation
Google-PhD-Evals, STEM-PDP, Amazon-CE, Amazon-bootcamp

2022 - 2025

Education

University of Cambridge

Doctor of Science, Nuclear Physics

2021 - 2023

Jawaharlal Nehru University

Master of Science, Physics

2019 - 2021

Work History

Turing

Senior Research Analyst

Palo Alto
2025 - Present

Turing

AI Research Scientist

Palo Alto
2024 - 2025