Ahmed Muhammad

AI Evaluator, Data Annotator

Giza, Egypt

$7.00/hr, Intermediate, CVAT, Data Annotation Tech, Label Studio

Key Skills

Software

CVAT

Data Annotation Tech

Label Studio

Scale AI

Top Subject Matter

No subject matter listed

Top Data Types

Computer Code Programming

Text

Video

Top Task Types

Bounding Box

Computer Programming Coding

Evaluation Rating

Function Calling

Prompt Response Writing SFT

Freelancer Overview

I’m an AI Evaluation & Data Annotation Specialist with hands-on experience shaping the performance, safety, and reliability of large language models (LLMs) and computer vision (CV) systems. Over the past several years, I’ve contributed to cutting-edge projects in:

NLP & Chatbot Alignment: evaluating multi-persona chatbots, ensuring tone consistency, refining responses, and conducting safety tests to prevent harmful or illegal outputs.

Prompt Engineering & Compliance: designing and testing system prompts, validating multi-turn compliance, and improving behavioural reliability.

Persona & Memory Testing: assessing long-form conversations to ensure models respect assigned personas and consistently integrate user-specific memories.

Mathematical Reasoning: creating and reviewing educational content, from grade school to university-level maths, ensuring truthfulness, clarity, and cultural/linguistic alignment.

Computer Vision: testing models with adversarial cases, refining outputs for tiny details, and building high-quality annotated datasets (bounding boxes, polygons, segmentation) with CVAT and Label Studio.

What ties my work together is a commitment to making AI systems safer, more accurate, and more aligned with human needs. Whether it’s strengthening RLHF pipelines, improving dataset quality, or fine-tuning system prompts, I bring a mix of technical rigour, bilingual expertise (Arabic & English), and a problem-solving mindset.

Intermediate, Arabic, English

Labeling Experience

Egyptian chatbot training

Data Annotation Tech, Text, Fine Tuning

Fine-tuning an Egyptian Arabic chatbot and correcting responses that do not align with the project rules.

2024

Code Eval project

Scale AI, Computer Code Programming, Evaluation Rating

Evaluating generated code-related responses and prompts, and correcting model responses that do not align with the project rules.

2024

Education

Helwan University

Bachelor's degree in Law and Economics

2007 - 2011

Work History

Outlier

AI Evaluation and Alignment Specialist

Remote

2024 - Present