Mikiyas Mohammed

Senior ML Infrastructure Engineer – RLHF & Human-in-the-Loop Platform Architect

Addis Ababa, Ethiopia
$15.00/hr · Expert
Internal/Proprietary Tooling · Other · Mercor

Key Skills

Software

Internal/Proprietary Tooling
Other
Mercor
Micro1

Top Subject Matter

Agentic AI
LLM Evaluation
Human-in-the-Loop AI Training

Top Data Types

Image
Text
Document

Top Task Types

RLHF
Function Calling
Prompt/Response Writing (SFT)
Fine Tuning
Object Detection
Evaluation Rating
Computer Programming / Coding

Freelancer Overview

Senior ML Infrastructure Engineer – RLHF & Human-in-the-Loop Platform Architect. Brings 11+ years of professional experience across ML infrastructure, RLHF and SFT workflows, LLM evaluation, and backend development. Core strengths include internal and proprietary tooling. Education includes a Bachelor of Science in Computer Engineering from the Addis Ababa Institute of Technology. AI-training focus includes data types such as computer code and programming, and labeling workflows including RLHF.

English: Expert

Labeling Experience

OSWorld Multimodal Agents

Image · RLHF
Trained multimodal AI agents within the OSWorld framework to execute complex, open-ended computer tasks autonomously. The scope of the project involved assessing the model's ability to natively interact with operating systems and desktop applications. Specific tasks included evaluating GUI navigation logic, analyzing action sequences for accuracy, and providing human feedback on the agent's desktop automation strategies. Ensured high-quality training data by rigorously scoring task completion success and correcting spatial reasoning or tool-use errors within the UI environment.

2025 - 2026
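To illustrate the task-scoring workflow described above, here is a minimal sketch of an annotation record and scoring rule. The field names, the penalty of 0.25 per error, and the `score` helper are all illustrative assumptions, not the OSWorld framework's actual schema.

```python
# Hypothetical annotation record for one desktop-automation task.
annotation = {
    "task": "Rename the file report.txt to report_final.txt",
    "action_trace": [
        {"action": "click", "target": "Files icon"},
        {"action": "right_click", "target": "report.txt"},
        {"action": "click", "target": "Rename"},
        {"action": "type", "text": "report_final.txt"},
    ],
    "task_completed": True,
    "errors": [],  # flagged spatial-reasoning or tool-use mistakes
}

def score(record):
    """Assumed rubric: 1.0 for a clean success, minus 0.25 per
    flagged error, floored at 0; failed tasks score 0."""
    if not record["task_completed"]:
        return 0.0
    return max(0.0, 1.0 - 0.25 * len(record["errors"]))
```

Structured records like this let reviewers audit both the final outcome and the individual UI actions that produced it.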

Senior ML Infrastructure Engineer – RLHF & Human-in-the-Loop Platform Architect

RLHF
As Senior ML Infrastructure Engineer at Turing, I architected and scaled human-in-the-loop RLHF and SFT workflows to support AI model training across 15+ applications. I designed and maintained distributed evaluation platforms for parallel agent behavior validation and orchestrated annotation systems involving over 500 global annotators. My responsibilities included building infrastructure for large-scale data annotation, pipeline optimization, and human-centered model evaluation.

• Designed RLHF and SFT platforms enabling human annotators to provide model feedback
• Integrated evaluation modules executing parallel Pass@k benchmarks on model outputs
• Developed real-time observability and human-in-the-loop verification of AI agent actions
• Built annotation data streaming and debugging pipelines using internal/proprietary tooling

2022 - 2026
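The Pass@k metric mentioned above is commonly computed with the unbiased estimator popularized by the Codex paper; a minimal sketch (the function name and signature are illustrative, not the platform's actual code):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n generations of which c
    pass the tests, is correct."""
    if n - c < k:
        return 1.0  # too few failures to fill a k-sample with them
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Running the estimator over many problems and averaging gives the benchmark score; evaluating candidates in parallel only changes how `n` samples are produced, not this formula.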

LLM Trainer - Agentic Function Calling

Computer Code & Programming · Function Calling
As an LLM Trainer for Agentic Function Calling at Turing, I collaborate with frontier AI labs to advance the reasoning and tool-use capabilities of foundational Large Language Models. My work involves architecting high-fidelity, multi-turn synthetic datasets that simulate complex interactions between users and AI assistants across various API ecosystems, including email, calendar, and cloud storage. By playing the dual role of user and assistant, I design intricate "agentic" workflows that require the model to perform logical tool selection, generate precise JSON-formatted function calls, and navigate real-world constraints. I focus on enhancing model reliability by crafting scenarios that test contextual understanding, task feasibility, and the ability to maintain conversational flow when external tools are not required. Through rigorous iteration and adherence to technical playbooks, I provide the high-quality proprietary data necessary for fine-tuning models to handle mission-critical, autonomous tasks with human-like precision.

2025 - 2025
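A JSON-formatted function call of the kind described above might look like the following. The tool name, argument schema, and turn structure are hypothetical examples, not a specific lab's format.

```python
import json

# One assistant turn from a hypothetical multi-turn
# function-calling dataset (calendar API ecosystem).
turn = {
    "role": "assistant",
    "content": None,  # no natural-language reply; the model calls a tool
    "tool_call": {
        "name": "calendar.create_event",
        "arguments": {
            "title": "Design review",
            "start": "2025-06-03T14:00:00Z",
            "duration_minutes": 30,
            "attendees": ["alice@example.com"],
        },
    },
}

# Training data is typically stored serialized; a strict round-trip
# check catches malformed calls before they reach fine-tuning.
serialized = json.dumps(turn["tool_call"])
parsed = json.loads(serialized)
```

Scenarios where no tool applies would instead put plain text in `content` and omit `tool_call`, which is exactly the "no tool required" behavior such datasets test.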

Code Interpreter & Supervised Fine-Tuning (SFT)

Computer Code & Programming · Prompt/Response Writing (SFT)
Authored high-fidelity Supervised Fine-Tuning (SFT) datasets to enhance the code generation and code interpreter capabilities of foundational AI models. The project scope covered complex, cross-disciplinary problem-solving across Data Science, Computational Physics, and Applied Mathematics. Tasks included writing optimal code solutions, generating step-by-step reasoning prompts, and validating algorithm efficiency. Maintained strict quality control by ensuring all code compiled perfectly, met rigorous big-O complexity constraints, and adhered to advanced mathematical and scientific principles.

2024 - 2024
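An SFT record of the kind authored above pairs a prompt with a vetted solution; quality control means the embedded code must actually run and meet its stated complexity bound. This shape is an assumption for illustration, not a specific lab's schema.

```python
import json

# Hypothetical one-line-per-example (JSONL) SFT record.
record = {
    "prompt": "Write a function returning the n-th Fibonacci "
              "number in O(n) time and O(1) space.",
    "response": (
        "def fib(n):\n"
        "    a, b = 0, 1\n"
        "    for _ in range(n):\n"
        "        a, b = b, a + b\n"
        "    return a"
    ),
    "metadata": {"domain": "applied_mathematics", "complexity": "O(n)"},
}

line = json.dumps(record)  # serialized as one JSONL training line

# Quality gate: the reference solution must execute correctly.
namespace = {}
exec(record["response"], namespace)
```

Automating this execute-and-verify step over every record is what keeps "all code compiles" from being a manual claim.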

Software Engineer - Algorithmic RLHF & RLMF

Computer Code & Programming · RLHF
Conducted Reinforcement Learning from Human Feedback (RLHF) on elite-level Data Structures and Algorithms. Evaluated and optimized Large Language Model code generation for 3200-rated Codeforces problems. Labeling tasks involved deep algorithmic analysis, stress-testing edge cases, and ranking model responses based on execution efficiency. Quality control was measured by flawless algorithmic execution, strict adherence to optimal time/space complexity (Big-O), and ensuring the model bypassed common competitive programming pitfalls.

2023 - 2024
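Ranking model responses by execution efficiency, as described above, can be sketched as timing each candidate on the same input and preferring the fastest correct one. The two candidate solutions and the `rank_by_runtime` helper are illustrative, not actual model outputs.

```python
import time

# Two hypothetical model responses to "sum of squares 1..n":
def sum_squares_loop(n):
    total = 0
    for i in range(1, n + 1):
        total += i * i
    return total  # O(n)

def sum_squares_closed_form(n):
    # O(1) closed form: n(n+1)(2n+1)/6
    return n * (n + 1) * (2 * n + 1) // 6

def rank_by_runtime(fns, n=200_000):
    """Time each candidate on the same input; preferred (fastest)
    response first. Real pipelines would also diff outputs against
    a reference and stress-test edge cases before ranking."""
    timed = []
    for fn in fns:
        start = time.perf_counter()
        result = fn(n)
        timed.append((time.perf_counter() - start, fn.__name__, result))
    timed.sort()
    return timed

ranking = rank_by_runtime([sum_squares_loop, sum_squares_closed_form])
```

The resulting preference pairs (fast correct solution over slow correct solution) are exactly the signal an efficiency-focused reward model is trained on.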

Education

Addis Ababa Institute of Technology

Bachelor of Science, Computer Engineering


Work History

Turing

Senior ML Infrastructure Engineer

Palo Alto
2022 - Present
HakimHub

Backend Developer

Addis Ababa
2019 - 2022