Apurva Anand


Data Science Consultant (Capstone Intern)

Chennai, India
Intermediate, AWS SageMaker

Key Skills

Software

AWS SageMaker

Top Subject Matter

Law Enforcement Vehicle Maintenance and Fault Detection
Contract Management and Insurance Fraud Detection
Legal Services & Contract Review

Top Data Types

Text
Image
Document

Top Task Types

Prompt + Response Writing (SFT)
Fine-tuning

Freelancer Overview

Data Scientist (Capstone Intern) with 7+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include the OpenAI API, Amazon Bedrock, and AWS SageMaker. Education: Master of Science, National University of Singapore (2023), and Bachelor of Technology, SRM Institute of Science and Technology (2020). AI-training focus covers Text data and labeling workflows including Fine-tuning and Prompt + Response Writing (SFT).

Intermediate

Labeling Experience

Data Scientist/AI Engineer

Text, Prompt + Response Writing (SFT)
Built, labeled, and curated prompt–response datasets used for prompt engineering and supervised fine-tuning (SFT) of Gemini-powered bots for IT support and customer chat workflows. Defined annotation standards for prompt quality, coverage, and safe instruction-following in LLM chat. Iteratively refined annotated datasets for use in retrieval-augmented generation (RAG) and dynamic query response.
• Established labeling pipelines using LangChain and an internal stack for prompt management.
• Conducted label audits for correctness and adherence to IT operations subject matter.
• Facilitated dataset expansion through targeted scenario mining and labeling.
• Coordinated with the engineering team for continuous SFT data improvements.


2024 - Present
AWS SageMaker

Lead Data Scientist

AWS SageMaker, Text, Fine-tuning
Led the fine-tuning and deployment of a Hugging Face Llama-2-7B PEFT model for real-time email classification, including hands-on annotation of emails with intent and category labels. Supervised the labeling pipeline and conducted several quality control cycles for the training/evaluation split. Optimized the annotation workflow for efficiency and model performance, with labeled examples driving the training dataset.
• Managed annotation tools and processes on the AWS SageMaker and Hugging Face Transformers stack.
• Refined labeling criteria for the sub-100ms inference target.
• Conducted periodic reviews and relabeling for model retraining.
• Facilitated knowledge transfer and annotation guideline workshops for the project team.


2024 - 2025

Lead Data Scientist

Text, Fine-tuning
Fine-tuned Amazon Titan Text G1 – Express on Amazon Bedrock with custom Guardrails to ensure safe and compliant AI responses for customer messaging scenarios. Defined data labeling protocols and annotated custom prompt–response datasets for model adaptation and policy enforcement. Evaluated output quality, iterated on data labeling procedures, and validated compliance in production fine-tuning cycles.
• Developed test prompts and evaluation rubrics for safe response annotation.
• Coordinated the team effort for scalable dataset labeling in an insurance context.
• Assessed and filtered labeled data for training reliability and coverage.
• Led feedback sessions with QA to refine annotation guidelines and post-fine-tuning evaluation.


2024 - 2025

Data Scientist (Capstone Intern)

Text, Fine-tuning
Fine-tuned GPT-3.5-turbo-0613 for interactive chatbot development, specifically to improve fault type prediction accuracy for Singapore Police Force patrol vehicles. Managed the model training workflow and supervised the labeling of conversation logs and query–response pairs for supervised fine-tuning. Evaluated the resulting model improvements and integrated the model into an end-user Gradio app for prediction tasks.
• Performed label quality assurance and model evaluation with the improved dataset.
• Coordinated with stakeholders to define correct labels aligned to fault types and scenarios.
• Documented iterations and prepared change logs for each fine-tuning cycle.
• Collaborated on post-deployment assessments to measure annotation-driven improvements.


2023

Education


National University of Singapore

Master of Science, Business Analytics

2022 - 2023

SRM Institute of Science and Technology

Bachelor of Technology, Information Technology

2016 - 2020

Work History


Ford Motor

Data Scientist

Chennai
2025 - Present

Trianz Digital Consulting

Lead Data Scientist

Chennai
2024 - 2025