
Muhammad Sulleman

Founder & Lead Engineer — Data Labeling, QA & LLM Output Evaluation, MetaOrcha

Islamabad, Pakistan
$30.00/hr · Expert · AWS SageMaker · HiveMind · Mercor

Key Skills

Software

AWS SageMaker
HiveMind
Mercor
Remotasks

Top Subject Matter

LLM agentic orchestration
behavioral testing
output evaluation

Top Data Types

Text
Image
Document

Top Task Types

Classification
Question Answering
Text Summarization
RLHF
Fine-tuning
Red Teaming
Transcription
Evaluation/Rating
Computer Programming/Coding
Function Calling
Prompt + Response Writing (SFT)
Text Generation

Freelancer Overview

Founder & Lead Engineer — Data Labeling, QA & LLM Output Evaluation, MetaOrcha. Brings 8+ years of experience spanning complex professional workflows, research, and quality-focused execution. Core strengths include internal and proprietary tooling. Education: Bachelor of Engineering, National University of Sciences and Technology (2022). AI-training focus includes data types such as Text and labeling workflows including Evaluation, Rating, and Classification.

Expert: English, Russian, Urdu

Labeling Experience

Founder & Lead Engineer — Data Labeling, QA & LLM Output Evaluation, MetaOrcha

Text
Led systematic AI failure-mode analysis and behavioral validation of agent outputs on a production-grade LLM orchestration platform. Developed and enforced labeling guidelines, correctness criteria, and ambiguity escalation paths for output review. Benchmarked functional equivalence across LLM providers and stress-tested third-party agent network outputs for quality.
• Defined annotation-style acceptance criteria for agentic outputs.
• Crafted adversarial test suites targeting model behaviors and failure cases.
• Built processes for edge-case enumeration, inter-annotator agreement, and output ambiguity handling.
• Used internal/proprietary tooling with evaluation rubrics for ongoing quality assurance.

2025 - Present

Co-founder & CTO — Annotation-Centric QA & Expected-vs-Actual Output Labeling, Crashx

Text
Created and implemented annotation-style acceptance criteria for education-tech platform features, defining expected, passing, and regression outputs. Generated labeled logs of expected-vs-actual outputs for student responses under adversarial exam conditions, ensuring reproducibility and ongoing QA. Communicated annotation guidelines clearly and scaled them to a 120+ student user base.
• Designed reproducible annotation guides for student-facing feature QA.
• Produced labeled datasets capturing edge cases in educational assessment.
• Established criteria for escalating ambiguous labeling outcomes.
• Leveraged internal/proprietary tools for log labeling and QA review.

2024 - Present

Engineer — Ground-Truth Label Definition & Decision Guideline Creation, WebPuls.ai

Text · Classification
Defined and documented ground-truth labels distinguishing meaningful from spurious change in web page content for production content-detection logic. Tuned label guidelines for precision/recall trade-offs and handled edge-case decision criteria at scale. Developed quality-gating and noise-filtering measures mirroring data labeling best practices.
• Created robust labeling rubrics for change classification in noisy data.
• Scaled annotation processes to thousands of web pages.
• Documented and updated decision guidelines for ambiguous content shifts.
• Used internal/proprietary detection logic and evaluation tools.

2024 - 2024

Intern — Validation Test Annotation & Edge-Case Labeling, PQC Labs

Text
Developed and executed validation tests to ensure protocol correctness, entropy preservation, and adversarial recovery in cryptographic data scenarios. Provided systematic edge-case coverage directly applicable to AI training data validation. Performed detailed error-scenario labeling, ambiguity resolution, and comprehensive review of protocol outcomes.
• Validated and labeled protocol outputs in adversarial settings.
• Created annotation criteria for entropy and recovery benchmarks.
• Conducted edge-case enumeration and protocol ambiguity handling.
• Relied on internal/proprietary test and validation frameworks.

2023 - 2023

Education


National University of Sciences and Technology

Bachelor of Engineering, Software Engineering

2022

Work History


MetaOrcha

Founder & Lead Engineer

Islamabad
2025 - Present

Crashx

Co-founder & CTO

Islamabad
2024 - Present