Harsh Patel - AI Researcher - Video/Coding/Image/Text Dataset Curation and Annotation

Key Skills

Software

Other

Don't disclose

Top Subject Matter

Retail Security / Shoplifting Detection

AI Model Training / LLM Code Synthesis

RLHF

Top Data Types

Video

Image

Computer Code Programming

Top Task Types

Object Detection

Classification

Polygon

Text Generation

Question Answering

Text Summarization

RLHF

Fine-tuning

Evaluation/Rating

Computer Programming/Coding

Data Collection

Function Calling

Prompt + Response Writing (SFT)

Freelancer Overview

My experience in AI training data is anchored by my work at Outlier.ai, where I specialized in curating high-quality Python, JavaScript, and React coding prompts to train Large Language Models (LLMs). Working within Reinforcement Learning from Human Feedback (RLHF) workflows, I evaluated, ranked, and refined model-generated code so that outputs met strict enterprise-grade standards for correctness and maintainability. While labeling and annotating thousands of complex coding samples, spanning algorithmic challenges, full-stack scenarios, and computer science interview problems, I debugged errors, rewrote flawed logic, and identified hallucinations. This hands-on validation improved model reliability and helped close the gap between raw AI generation and production-ready code.

Beyond text and code evaluation, my data annotation expertise extends into computer vision. As an AI Researcher at Edge Signal, I curated and annotated a custom video dataset of over 5,000 clips (approximately 200 hours) to train Vision Transformer (ViT) architectures for real-time anomaly detection.

What sets me apart in the AI training space is my dual perspective as both a data annotator and a Senior Software/AI Engineer. Because I build, deploy, and benchmark the underlying MLOps pipelines and foundation models (such as TimesFM and other transformer models), I understand how training data affects model performance. This engineering mindset means the data I curate is prepared with efficient, real-world machine learning systems in mind.

Intermediate

English

Hindi

Labeling Experience

AI Researcher - Video Dataset Curation and Annotation

Other

Video

Object Detection

I curated and annotated a custom video dataset of over 5,000 clips, improving model robustness for shoplifting detection. The labeling process focused on identifying actions and objects under various conditions, including occlusions and lighting changes. Advanced Vision Transformer architectures were used as part of model training for real-time video analysis.
• Prepared a 200-hour video dataset for edge device deployment.
• Applied spatiotemporal annotation tailored for action recognition tasks.
• Integrated annotated datasets into benchmarking pipelines.
• Enhanced action detection through high-quality labeling and curation.

2026 - Present

Competition Coders - AI Training and LLM Code Annotation

Other

I labeled and annotated thousands of code samples in Python, JavaScript, and React to fine-tune LLMs for multi-language syntax and real-world developer workflows. My responsibilities included curating coding prompts, applying RLHF, and debugging and ranking model-generated code outputs. I evaluated the correctness, efficiency, and maintainability of AI-generated responses to enhance language model performance.
• Labeled coding tasks across algorithmic, full-stack, and interview-style challenges.
• Utilized reinforcement learning from human feedback to evaluate and refine outputs.
• Fixed logic and aligned completions to production standards.
• Ranked and rated large volumes of coding responses for model improvement.

2024 - 2024

Education

University of Ottawa

Master of Engineering, Electrical and Computer Engineering, Applied Artificial Intelligence

2024 - 2026

Charusat University

Bachelor of Technology, Computer Science and Engineering

2018 - 2022

Work History

Edge Signal

AI Researcher

Toronto

2026 - Present

Comtech Telecommunications Corp.

Data and AI R&D Co-op

Gatineau

2025 - 2025