Abolfazl Moradian (ilia)

LLM Evaluation & AI Training Specialist (Contract)

Tehran, Finland
$20.00/hr, Intermediate, Appen, Internal/Proprietary Tooling, Label Studio

Key Skills

Software

Appen
Internal/Proprietary Tooling
Label Studio

Top Subject Matter

STEM Reasoning
AI Model Evaluation
Code Generation

Top Data Types

Text
Image
Video

Top Task Types

Bounding Box
Evaluation Rating
RLHF
Classification
Computer Programming / Coding
Prompt + Response Writing (SFT)
Question Answering
Transcription
Red Teaming

Freelancer Overview

I've been working on AI training and LLM evaluation projects through Outlier, Meridial, and CrowdGen since early 2024, evaluating frontier model outputs across STEM reasoning, code generation, SQL validation, and operations research tasks. My work includes writing benchmark-quality prompts, designing multi-dimensional scoring rubrics, running head-to-head model comparisons, and identifying hallucinations and reasoning failures, especially in technical domains like supply chain optimization and quantitative modeling. What sets me apart is that I'm not just a reviewer: I have 12+ years of industrial engineering experience plus published peer-reviewed research, so when a model makes a subtle domain error, misses an assumption, or produces plausible-sounding but flawed logic, I catch it. I'm comfortable working in Python, SQL, Power BI, and statistical modeling, and I'm used to the kind of structured, evidence-based evaluation that high-quality AI training data demands.

Intermediate, English, Persian (Farsi), Finnish

Labeling Experience

Audio Transcription & Speech Quality Evaluation

Audio, Transcription
Transcribed audio clips and rated AI-generated speech for things like clarity and naturalness. Flagged obvious issues like wrong words or awkward pronunciation. Being bilingual (English/Persian) was useful for some of the tasks.

2024 - Present

Video Content Annotation & Temporal Event Labeling

Video, Classification
Labeled video segments, tagging events, classifying scenes, and marking timestamps based on the project's categories. Some QA review of other annotators' labels was involved too. Detail-oriented work that required patience and following guidelines closely.

2024 - Present

Image Classification & Object Annotation for AI Model Training

Image, Classification
Did image labeling work, classifying objects, drawing bounding boxes, and checking label consistency across batches. Followed the project guidelines, flagged unclear cases, and did my best to keep annotations clean and accurate. Nothing fancy, but the kind of careful, repetitive work I'm comfortable with from my quality management background.

2024 - Present

LLM Response Evaluation & Text Annotation for AI Training

Text, Evaluation Rating
I evaluate LLM-generated responses on Outlier and Meridial projects, mostly in STEM, Python/SQL code, and operations research. The work involves scoring outputs against rubrics, catching hallucinations, comparing models side by side, and writing prompts that push models to their limits. I've worked across several projects (Hickory, El Dorado, Aether, Phoenix), each with its own guidelines and calibration process. My engineering background helps me spot domain-specific errors that a general reviewer would likely miss.

2024 - Present

Education

LUT University

Master of Science, Industrial Engineering and Management

2024 - 2026

Azad University

Bachelor of Science, Industrial Engineering

2007 - 2007

Work History

Various Clients

Data Analyst

Tehran
2020 - Present

Payun Tejarat Pars

Supply Chain and Quality Analyst

Tehran
2022 - 2024