Labeller
I have contributed to multiple AI training and evaluation projects focused on improving large language model (LLM) performance in coding, reasoning, and transcription. My work centered on evaluating AI-generated responses for correctness, logical consistency, and adherence to instructions, notably on the Claude Code CLI CHP Transcript Evaluation project and on agentic coding response evaluation. In these projects, I analyzed multi-step coding solutions, identified errors in reasoning or implementation, and ensured outputs met high quality standards. I also worked on transcription tasks (the ATC project on Alignerr), where I maintained accuracy, consistent formatting, and compliance with detailed annotation guidelines. Through this experience, I developed strong skills in LLM evaluation, quality assurance, annotation workflows, and analysis of model behavior, enabling me to contribute effectively to AI training pipelines.