Ali Abidi

AI Data Evaluator

USA
$50.00/hr, Expert, Micro1

Key Skills

Software

Micro1

Top Subject Matter

Search-augmented LLMs
Macroeconomic and Geopolitical Events
Legal Services & Contract Review

Top Data Types

Text
Document

Top Task Types

RLHF

Freelancer Overview

I’ve worked directly on training and evaluating search-augmented LLMs through adversarial testing, prompt engineering, and RLHF-based workflows. My focus was on systematically breaking models and documenting failure modes at a granular level, specifically hallucinations, temporal reasoning errors in live news environments, and retrieval gaps tied to indexing latency or poor source selection. I designed targeted, time-sensitive prompts across domains like macroeconomic releases and geopolitical events, then evaluated outputs against primary sources to assess factual accuracy, alignment, and instruction adherence. Beyond evaluation, I built structured grading rubrics and produced detailed conversation-level analyses to support model improvement and fine-tuning. My workflow emphasizes consistency, edge-case identification, and high-signal documentation, ensuring outputs are both scalable and actionable for training pipelines. What sets me apart is the combination of real-time reasoning, precision under uncertainty, and the ability to translate complex model behavior into clear, structured feedback that directly improves system performance.

English (Expert)

Labeling Experience

AI Data Evaluator (Contract)

Micro1, Text, RLHF
As an AI Data Evaluator at Micro1 / SearchEvals, I engineered adversarial and time-sensitive prompts to stress-test search-augmented large language models (LLMs). I evaluated model outputs for factual accuracy and documented the failure modes that surfaced, providing detailed analyses to improve AI systems. I designed grading rubrics and evaluation frameworks, and performed real-time fact-checking against primary sources during critical events.
• Stress-tested LLMs with adversarial and real-world data prompts.
• Conducted evaluation for hallucinations, temporal confusion, and retrieval failures.
• Developed standardized scoring frameworks for model output assessment.
• Authored conversation analyses to document and classify model errors.

2026 - 2026

Education

Georgia State University — J. Mack Robinson College of Business

Bachelor of Business Administration, Business Administration (Entrepreneurship)

2024 - 2024

Work History

Dhow

Co-Founder

USA
2025 - Present

Self-Employed

Independent Contractor — Derivatives Trader

Atlanta
2020 - Present