Max

Software Engineer (20 years) / AI Coding Agent Evaluation / RL Environments Engineer

Tel Aviv, Israel
$40.00/hr, Intermediate, Other, Mindrift, Mercor

Key Skills

Software

Other
Mindrift
Mercor
Toloka
Google Cloud Vertex AI
Internal/Proprietary Tooling

Top Subject Matter

AI coding agents and software engineering assessment
Computer-use expert
RL Environments engineer

Top Data Types

Computer Code Programming
Document
Text

Top Task Types

Computer Programming/Coding
Data Collection
Evaluation/Rating
RLHF
Prompt + Response Writing (SFT)
Question Answering
Red Teaming
Fine-tuning
Text Summarization

Freelancer Overview

At Mindrift, I design adversarial evaluation tasks for Claude Opus 4.6 within simulated company environments, complete with Python and TypeScript repositories, Jira tickets, documentation, and Slack messages. My goal is to craft prompts calibrated so the agent fails roughly half the time, then write automated end-to-end validation using the pytest framework, including system tests and AST-based code verification. After each agent run, I analyze the diffs and transcripts in depth, extracting concrete evidence of genuine reasoning failures that feeds directly into Opus's training pipeline.

At Mercor, I recorded 200+ half-hour screencast sessions of professional software usage for computer-use model training, covering IDEs (VS Code, Xcode, PyCharm), 3D modeling tools (Blender, 3ds Max, Reality Composer), graphics software (Photoshop, Illustrator), office applications (Notes, Pages, Numbers), and advanced macOS workflows. Each session includes precise narration of every action performed, with explicit reasoning for each decision, providing the step-by-step demonstration data that teaches AI agents how humans actually navigate complex software environments.

I also perform AI red teaming work covering safety taxonomies and adversarial attack design across single-turn and multi-turn interactions (including jailbreaking, prompt injection, and crescendo attacks), as well as edge case labeling, bias-testing prompt construction, and defensive architecture evaluation. This involves designing red team prompts that probe model vulnerabilities and systematically categorizing failure modes across safety dimensions.

Intermediate, Hebrew, Russian, English

Labeling Experience

Mercor

AI Coding Agent Evaluation & RL Environments Engineer

Mercor, Computer Code Programming, RLHF
At Mindrift (Toloka AI), I design adversarial evaluation tasks for Claude Opus 4.6 within simulated company environments, complete with Python and TypeScript repositories, Jira tickets, documentation, and Slack messages. My goal is to craft prompts calibrated so the agent fails roughly half the time, then write automated end-to-end validation using the pytest framework, including system tests and AST-based code verification. After each agent run, I analyze the diffs and transcripts in depth, extracting concrete evidence of genuine reasoning failures that feeds directly into Opus's training pipeline.
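As an illustration of what AST-based code verification under pytest can look like, here is a minimal sketch. It is not the actual validation suite: the helper names (`defines_function`, `calls_function`), the sample patch, and the checked identifiers are all hypothetical, chosen only to show the idea of asserting on code structure rather than on text matches.

```python
import ast


def defines_function(source: str, func_name: str) -> bool:
    """Return True if `source` defines a (sync or async) function named `func_name`."""
    tree = ast.parse(source)
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name == func_name
        for node in ast.walk(tree)
    )


def calls_function(source: str, func_name: str) -> bool:
    """Return True if `source` calls `func_name`, either bare or as an attribute."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            target = node.func
            if isinstance(target, ast.Name) and target.id == func_name:
                return True
            if isinstance(target, ast.Attribute) and target.attr == func_name:
                return True
    return False


# Hypothetical code produced by an agent run (an assumption for this sketch).
AGENT_PATCH = '''
def apply_discount(price, rate):
    validate_rate(rate)
    return price * (1 - rate)
'''


def test_patch_defines_and_validates():
    # Structural checks: the function exists, validates its input,
    # and does not reach for eval().
    assert defines_function(AGENT_PATCH, "apply_discount")
    assert calls_function(AGENT_PATCH, "validate_rate")
    assert not calls_function(AGENT_PATCH, "eval")
```

Saved in a file such as `test_patch.py` (name assumed), this runs with a plain `pytest` invocation. Checking the parsed tree rather than grepping the text means that comments or string literals mentioning `validate_rate` would not satisfy the check.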


2026 - Present
Mindrift

Computer-Use Training Data Specialist

Mindrift, Computer Code Programming, Data Collection
I recorded 200+ half-hour screencast sessions of professional software usage for computer-use model training, covering IDEs (VS Code, Xcode, PyCharm), 3D modeling tools (Blender, 3ds Max, Reality Composer), graphics software (Photoshop, Illustrator), office applications (Notes, Pages, Numbers), and advanced macOS workflows. Each session includes precise narration of every action performed, with explicit reasoning for each decision, providing the step-by-step demonstration data that teaches AI agents how humans actually navigate complex software environments.


2026 - Present

AI Red Team & Safety Evaluation Specialist

Other, Text, Red Teaming
I perform AI red teaming work covering safety taxonomies and adversarial attack design across single-turn and multi-turn interactions (including jailbreaking, prompt injection, and crescendo attacks), as well as edge case labeling, bias-testing prompt construction, and defensive architecture evaluation. This involves designing red team prompts that probe model vulnerabilities and systematically categorizing failure modes across safety dimensions.


2025 - Present

Education


Reichman University - Media Innovation Lab

Finished year-long course, Human-Computer Interaction & Augmented Reality research

2011 - 2012

Reichman University

Bachelor of Arts, Computer Science

2008 - 2011

Work History


Mercor

Computer-Use Training Data Specialist

2026 - Present

Mindrift

AI Coding Agent Evaluation & RL Environments Engineer

2026 - Present