Saamir Abbas - AI Evaluation Specialist / LLM Response Analyst

Key Skills

Software

Mercor

OneForma

Appen

Top Subject Matter

LLM response evaluation

Manufacturing domain

Conversational AI

Top Data Types

Text

Audio

Video

Document

Image

Top Task Types

Transcription

Bounding Box

Object Detection

Text Generation

Text Summarization

RLHF

Evaluation/Rating

Data Collection

Prompt + Response Writing (SFT)

Classification

Fine-tuning

Question Answering

Freelancer Overview

AI Evaluation Specialist / LLM Response Analyst with 7+ years of professional experience across complex workflows, research, and quality-focused execution. Platform experience includes Mercor, OneForma, and Turing. Education: Bachelor of Science, Era University (2025). AI-training focus covers Text, Audio, and Video data and labeling workflows including Evaluation, Rating, and Transcription.

Expert

English

Urdu

Labeling Experience

AI Evaluation Specialist / LLM Response Analyst

Mercor

Text

As an AI Evaluation Specialist / LLM Response Analyst at Mercor, I performed structured evaluations of AI-generated responses using rubric-based and side-by-side (SxS) frameworks. I analyzed model outputs for reasoning, correctness, and quality across diverse real-world scenarios. My work supported improvements in AI agent performance, particularly in specialized domains such as manufacturing.

• Conducted high-volume, detailed side-by-side comparisons of LLM responses.
• Provided structured justifications to support the reliability of evaluation scores.
• Identified failure patterns and proposed areas for model improvement.
• Applied domain-specific knowledge to manufacturing use cases.

2026 - Present

Search Engine Evaluator / LLM Evaluation Specialist

OneForma

Text

At OneForma, I worked as a Search Engine Evaluator and LLM Evaluation Specialist, focusing on the evaluation and ranking of AI-generated responses across multiple domains. Tasks included search relevance, query intent matching, multimodal and fact-based validation, and prompt evaluation for model improvement. I consistently applied detailed guidelines to large-scale, diverse datasets.

• Performed side-by-side and rubric-based response comparisons across conversational and search domains.
• Evaluated text, audio, and image outputs for multimodal tasks.
• Handled structured-content and language-detection tasks such as Wikidata validation.
• Ensured accuracy and consistency by following complex project instructions.

2024 - Present

Business Analyst / AI Evaluation Specialist

Text

During my tenure at Turing as a Business Analyst / AI Evaluation Specialist, I evaluated AI outputs via detailed side-by-side comparisons. I focused on identifying hallucinations and inconsistencies and on providing comprehensive justifications using structured evaluation frameworks. My work combined manual evaluation and basic technical tools to ensure accurate findings for model improvement.

• Compared AI-generated outputs with gold standards for relevance and correctness.
• Used browser, command-line, and Python tools to support quality analysis.
• Fact-checked information to mitigate hallucinations and errors.
• Documented results for wider QA and development use.

2026 - 2026

AI Annotation Specialist / QA Reviewer

Audio

Transcription

As an AI Annotation Specialist and QA Reviewer at RWS, I performed verbatim and golden audio transcription along with annotation of non-speech events. I evaluated multi-turn dialogues for conversational AI and conducted quality assurance reviews for transcription accuracy. The position involved designing and assessing complex workflow scenarios and contributing to bilingual dataset annotation.

• Applied rubric-based frameworks for dialogue quality evaluation in English and Urdu.
• Ensured guideline compliance as a promoted reviewer overseeing contributor accuracy.
• Evaluated multi-turn real-world scenarios and decision chains for AI agents.
• Recorded and annotated diverse audio and video datasets for training purposes.

2025 - 2026

Document Annotation Specialist

Document

As a Document Annotation Specialist at Invisible Technologies, I validated data extracted from PDFs using structured JSON schemas. This involved cross-referencing, correcting field-level discrepancies, and verifying numerical and textual consistency in complex documents. I contributed to improving extraction models through quality assurance and edge case identification.

• Compared and corrected AI-extracted data across documents for accuracy.
• Ensured guideline compliance for field validation and totals verification.
• Identified and documented complex edge cases to facilitate model upgrades.
• Maintained consistency across large, structured datasets for performance metrics.

2025 - 2025

Education

Era University

Bachelor of Science, Zoology

2022 - 2025

Work History

Freelance

Graphic Designer / UI-UX Designer / Motion Designer

N/A

2020 - Present