Daulet Amerzhanov


AI Model Evaluator – Multimodal (image, audio, video, text) (Contract)

Astana, Kazakhstan
$63.00/hr
Intermediate
OneForma
Scale AI

Key Skills

Software

OneForma
Scale AI

Top Subject Matter

AI Model Evaluation (Multimodal)
AI Model Training/Evaluation (Multimodal Reasoning)
AI Model Training/Evaluation (Scientific Reasoning, Biomedical)

Top Data Types

Document
Image
Text

Top Task Types

Diagnosis
Prompt Response Writing SFT
Question Answering
Red Teaming
Text Generation

Freelancer Overview

AI Model Evaluator – Multimodal (image, audio, video, text) (Contract). Brings 11+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include Internal and Proprietary Tooling. AI-training focus includes data types such as Image and Text and labeling workflows including Evaluation, Rating, and Prompt + Response Writing (SFT).

Intermediate
Russian
German
English

Labeling Experience

AI Model Evaluator – Multimodal (image, audio, video, text) (Contract)

Image
Compared and ranked outputs from multiple AI models across image, audio, video, and text modalities using detailed rubrics. Evaluated image generation quality, performed object removal and inpainting tasks, and assessed audio responses for clarity and correctness. Provided structured, written justifications for model ratings and identified nuanced failure modes such as hallucinations and visual artifacts.
• Multimodal evaluation included image, video, audio, and text outputs.
• Prompts written to guide image editing and inpainting.
• Visual grounding and instruction-following were key assessment areas.
• Failure modes like hallucinations and instruction drift were documented.


2026 - Present

AI Model Trainer – Multimodal Reasoning (Contract)

Image
Prompt Response Writing SFT
Designed multimodal tasks combining images and prompts to test model performance in visual reasoning. Generated scientific images and wrote prompts requiring integration of visual and textual cues. Evaluated model outputs for accuracy, reasoning, and instruction adherence.
• Tasks involved visual grounding and context understanding.
• Custom scientific images and plots were generated using R.
• Focused on model's multimodal reasoning abilities.
• Conducted evaluation and rubric-based scoring of model outputs.


2026

AI Model Trainer / Prompt Engineer (Contract)

Text
Prompt Response Writing SFT
Designed and refined high-difficulty scientific prompts for LLM training and evaluation. Translated complex biomedical literature, experiments, and reasoning into prompts to stress-test AI models. Evaluated outputs for factual accuracy, depth of reasoning, and hallucination risk.
• Created edge cases for scientific reasoning and instruction-following.
• Prompts based on peer-reviewed publications in biomedical sciences.
• Focused on challenging models with multi-step and literature-based reasoning.
• Improved model performance on abstraction, hypothesis evaluation, and accuracy.


2025

Education


Nazarbayev University School of Medicine

Doctor of Medicine, Medicine

2016 - 2020

Nazarbayev University School of Science and Technology

Bachelor of Science, Biology

2012 - 2016

Work History


Lausanne University Hospital/University of Lausanne

PhD Scientist/Bioinformatician

Lausanne
2024 - Present

Heart Center Endowment Fund

Event Coordinator

Astana
2022