
Mordecai Machazire

AI Safety Researcher - Protocol-Governed Systems

Buffalo, USA
$75.00/hr · Expert · Other · Mercor · Google Cloud Vertex AI

Key Skills

Software

Other
Mercor
Google Cloud Vertex AI

Top Subject Matter

No subject matter listed

Top Data Types

Audio
Computer Code Programming
Document
Image
Medical DICOM
Text
Video

Top Label Types

Action Recognition
Audio Recording
Classification
Evaluation Rating
Object Detection
Prompt Response Writing SFT
Question Answering
Text Generation
Text Summarization
Transcription

Freelancer Overview

I specialize in designing and implementing protocol-driven evaluation systems that ensure AI models are trained, labeled, and audited with maximum reliability and transparency. My experience spans building rubric-based pipelines for clinical data (MIMIC-III ICU), where I decomposed complex decisions into binary, verifiable labeling criteria, and developing the RAG-URL Protocol, which enforces auditable, step-by-step reasoning for LLMs. I am skilled in Python, SQL, and tools like pandas, NumPy, and SPSS, and have led projects applying AI-assisted summarization, translation, and structured QA/QC for research and healthcare domains. My focus is on creating data labeling and annotation workflows that are not only accurate but also interpretable and scalable for high-stakes applications.

English (Expert)

Labeling Experience

Model Validation Expert (MOVE Fellow)

Other · Text · Evaluation Rating
As a Model Validation Expert (MOVE Fellow) at Handshake AI, I evaluate large language model (LLM) responses using structured rubric-based frameworks. I generate RLHF training data through preference ranking and response comparison against predefined evaluation criteria, with a focus on health science and medical domain QA for subject-matter accuracy.
• Evaluate LLM response quality, factual accuracy, and reasoning coherence using structured rubrics
• Generate RLHF training data via preference ranking and comparative response evaluation
• Apply medical and scientific expertise in LLM evaluation tasks
• Utilize the Handshake AI platform for data annotation and assessment

2025
Google Cloud Vertex AI

Lead Researcher — Health Protocol Auditor

Google Cloud Vertex AI · Medical DICOM · Evaluation Rating
I led the development of ground-truth evaluation pipelines for clinical AI using the MIMIC-III ICU database. These pipelines established rubric-based, mathematically verifiable checks of health AI workload correctness, and my work supported reference dataset creation for training, quality control, and external audits.
• Applied the Cockcroft-Gault formula, clinical statistics, and QA protocols
• Structured canonical outputs for clinical AI model assessment
• Advanced reproducibility and auditability in healthcare datasets
• Provided high-validity labels for medical AI pipeline training

2023
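
The Cockcroft-Gault estimate referenced above is a standard clinical formula, so a correctness check for it can be made fully deterministic. A minimal sketch (the patient values in the example are illustrative, not drawn from MIMIC-III):

```python
def cockcroft_gault(age_years: float, weight_kg: float,
                    serum_creatinine_mg_dl: float, female: bool) -> float:
    """Estimate creatinine clearance (mL/min) via Cockcroft-Gault."""
    crcl = ((140 - age_years) * weight_kg) / (72 * serum_creatinine_mg_dl)
    # Standard 0.85 adjustment factor for female patients.
    return crcl * 0.85 if female else crcl

# Example: 60-year-old, 72 kg male with serum creatinine 1.2 mg/dL
print(round(cockcroft_gault(60, 72, 1.2, female=False), 1))  # → 66.7
```

Because the expected value is computable in closed form, a label produced this way can be verified exactly rather than judged subjectively.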
Google Cloud Vertex AI

9-Step Hypothesis Testing Rubric Designer

Google Cloud Vertex AI · Text · Evaluation Rating
I created a structured evaluation protocol for clinical hypothesis testing performed by LLMs. The framework defines nine binary, independently verifiable criteria for auditability and accuracy, and supports external auditing and improvement of AI-assisted statistical analysis workflows.
• Rubric covers research question clarity, hypotheses, statistical assumptions, and interpretation
• Pass/fail conditions for each step ensure reproducible human and AI evaluation
• Enables stepwise quality assurance in health science AI tasks
• Designed for integration into LLM assessment workflows

2023
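
A rubric of binary, independently verifiable criteria can be sketched as follows. The criterion names and checks below are illustrative assumptions, not the protocol's actual nine steps:

```python
# Each criterion is an independent pass/fail check; no partial credit.
def score_rubric(response: dict, criteria: dict) -> dict:
    """Apply every check independently and return per-criterion results."""
    return {name: bool(check(response)) for name, check in criteria.items()}

# Hypothetical criteria for an LLM-written statistical analysis.
criteria = {
    "states_research_question": lambda r: bool(r.get("question")),
    "names_null_hypothesis":    lambda r: "H0" in r.get("hypotheses", ""),
    "reports_p_value":          lambda r: r.get("p_value") is not None,
}

response = {"question": "Does drug X lower LDL?",
            "hypotheses": "H0: no effect", "p_value": 0.03}
results = score_rubric(response, criteria)
print(all(results.values()))  # → True (every criterion passed)
```

Keeping each criterion independent is what makes the overall score auditable: a disputed grade can be traced to a single failed check.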
Mercor

Mercor Rubric Academy Capstone Project

Mercor · Text · Evaluation Rating
I completed a structured evaluation capstone for Mercor Rubric Academy certification, focused on clinical and biostatistical LLM outputs. The project entailed rubric creation, binary criterion scoring, and performance analysis for Claude and ChatGPT, establishing a reproducible and mathematically sound assessment framework.
• Developed a 7-criterion binary rubric for eGFR, hypothesis testing, and statistical LLM tasks
• Set explicit scoring logic, rounding/tolerance parameters, and criterion dependencies
• Demonstrated evolution of rubric standards via correction of expected values
• Provided precise, reproducible scoring for AI model outputs in the health domain

2026 - 2026
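
Tolerance-aware numeric grading of the kind described can be sketched with `math.isclose`; the expected value and tolerance below are assumptions for illustration, not the capstone's actual parameters:

```python
import math

def grade_numeric(candidate: float, expected: float,
                  abs_tol: float = 0.05) -> bool:
    """Pass iff the model's value is within the stated absolute tolerance."""
    return math.isclose(candidate, expected, abs_tol=abs_tol, rel_tol=0.0)

print(grade_numeric(66.68, 66.67))  # → True  (off by 0.01, within 0.05)
print(grade_numeric(66.8, 66.67))   # → False (off by 0.13)
```

Stating the tolerance explicitly in the rubric, rather than leaving "close enough" to grader judgment, is what makes scores reproducible across human and automated evaluators.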
Mercor

Data Annotation Contractor

Mercor · Text · Classification
As a Data Annotation Contractor for Mercor, I contributed to multiple projects covering visual, multimodal, and text annotation tasks. I applied domain-specific rubrics to produce high-quality labeled datasets across diverse subject matter, enabling accurate supervised training and validation in AI model development.
• Labeled match events, player actions, and formations for sports AI using structured taxonomies
• Performed image classification and object detection with predefined annotation rubrics
• Evaluated LLM outputs for quality and factual accuracy, contributing to SFT datasets
• Executed multimodal annotation, including image captioning and VQA, on the Multimango platform

2025 - 2026

Education

University of Saskatchewan

Master of Public Health, Public Health

2021 - 2024

St George's, University of London

Human Life Sciences, Medicine

2017 - 2023

Work History


Project Hamburg Research, Inc

Health Protocol Auditor

Buffalo
2023 - 2025

Project Hamburg LLC

Lead Researcher

Chicago
2020 - 2025