Arajit Samal - Senior Research Analyst - AI Advanced Mathematics & Physics

Key Skills

Software

Scale AI

Labelbox

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming

Text

Top Task Types

Question Answering

Text Generation

RLHF

Computer Programming/Coding

Prompt + Response Writing (SFT)

Evaluation/Rating

Freelancer Overview

I am an AI research scientist and physicist with extensive experience in designing, evaluating, and curating high-precision datasets for training and fine-tuning large language models (LLMs). My background in advanced mathematics, physics, and complex systems enables me to develop rigorous benchmarks and annotation methodologies that probe model reasoning, abstraction, and numerical accuracy. I have systematically analyzed model outputs to identify and address reasoning failures, hallucinations, and edge-case behaviors, directly informing improvements in data labeling quality and model alignment. My work emphasizes interpretability, reliability, and correctness in AI systems, and I excel at collaborating with interdisciplinary teams to translate theoretical insights into scalable, reproducible data pipelines for high-stakes domains such as STEM, physics, and mathematical reasoning.

Expert

Hindi, Bengali, Odia (Oriya), Spanish, French, English, Italian

Labeling Experience

Senior Research Analyst – LLM Evaluation and Dataset Alignment

Scale AI, Text, Evaluation/Rating

As a Senior Research Analyst at Turing, I focused on the evaluation and alignment of large language models (LLMs) in STEM, physics, and mathematics domains. I applied formal mathematical reasoning to develop, curate, and evaluate datasets and prompts aimed at improving LLM reasoning, generalization, and robustness. My work included designing mathematical reasoning benchmarks, systematically analyzing model outputs, and supporting safe, scalable AI systems.

• Developed and validated high-difficulty math reasoning tasks for LLM evaluation.
• Conducted failure-mode analysis and labeled outputs to identify hallucinations.
• Enhanced dataset quality, correctness, and interpretability for LLM training.
• Collaborated on scalable, reproducible AI research pipelines.

2025

AI Research Scientist – LLM Benchmarking and Dataset Curation

Labelbox, Text, Evaluation/Rating

As an AI Research Scientist at Turing, I contributed to the design, curation, and evaluation of complex reasoning benchmarks and datasets for LLMs. I performed systematic analysis of LLM behaviors and developed high-precision mathematical datasets to support reasoning-centric model training and fine-tuning. This hands-on work ensured improvements in correctness, transparency, and safety in AI outputs.

• Designed multi-step reasoning benchmarks for evaluating LLM abstraction and logic.
• Labeled and analyzed output failures, hallucinations, and consistency errors.
• Curated datasets and supported AI alignment with improved controllability.
• Collaborated to translate theoretical insights into practical evaluation pipelines.

2024 - 2025

My Data Labelling Tools

Scale AI, Computer Code Programming, Question Answering, Text Generation

Google-PhD-Evals, STEM-PDP, Amazon-CE, Amazon-bootcamp

2022 - 2025

Education

University of Cambridge

Doctor of Science, Nuclear Physics

2021 - 2023

Jawaharlal Nehru University

Master of Science, Physics

2019 - 2021

Work History

Turing

Senior Research Analyst

Palo Alto

2025 - Present

Turing

AI Research Scientist

Palo Alto

2024 - 2025