Syeda Anaa Batool

Freelance ML & AI Software Engineer — NLP AI Agent Training and Evaluation

N/A, Pakistan
$70.00/hr
Intermediate
Don't disclose

Key Skills

Software

Don't disclose

Top Subject Matter

Natural Language Processing
Conversational AI
Large Language Models

Top Data Types

Text
Image
Document

Top Task Types

Classification
Object Detection
Text Generation
Text Summarization
Transcription
Data Collection
Segmentation

Freelancer Overview

Freelance ML & AI Software Engineer — NLP AI Agent Training and Evaluation. Brings 10+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include Internal and Proprietary Tooling. Education includes Master of Science, University of Management and Technology (2026) and Bachelor of Science, COMSATS University of Information and Technology (2017). AI-training focus includes data types such as Text and labeling workflows including Evaluation and Rating.

English: Intermediate

Labeling Experience

LLM Evaluation & Data Annotation — AdaptiveRAG Project

Text
Built and tested retrieval-augmented generation (RAG) frameworks for LLMs, incorporating faithfulness and hallucination evaluation. Developed and applied custom metrics (EM, F1, faithfulness, hallucination rate) to annotate and assess output quality. Produced publication-quality documentation explaining experimental setup, annotation guidelines, and evaluation criteria.
• Applied label definitions for LLM answer faithfulness and hallucination
• Annotated and validated responses for quality benchmarking
• Iterated evaluation protocols based on model version changes
• Communicated annotation outcomes for publication and review

2023 - Present

Freelance ML & AI Software Engineer — NLP AI Agent Training and Evaluation

Text
Designed and implemented custom Alexa NLP skills utilizing intent classification and entity recognition pipelines. Created, evaluated, and iterated on model outputs to refine conversational AI agent tasks. Produced technical documentation detailing model behavior, output accuracy, and failure analysis.
• Trained and evaluated NLP models using real user utterance data
• Analyzed agent performance to identify training signal gaps
• Wrote and updated task specifications for AI behavior refinement
• Documented labeling methodology for technical stakeholders

2020 - Present

FedNLP — Federated Learning NLP Evaluation & Data Labeling

Text
Designed federated learning NLP experiments with differential privacy, preparing and segmenting data distributions for model aggregation. Evaluated model convergence and calibration via systematic annotation of experiment outputs. Documented labeling and evaluation methodology for privacy-preserving NLP benchmarking.
• Segmented text data for federated training and evaluation
• Annotated experiment outcomes to assess calibration metrics
• Tracked convergence through regular evaluation checkpoints
• Standardized documentation for label procedures

2022 - 2023

Education

University of Management and Technology

Master of Science, Computer Science

2024 - 2026

COMSATS University of Information and Technology

Bachelor of Science, Computer Science

2013 - 2017

Work History

Freelance

AI & Technology Consultant

N/A
2020 - Present

Freelance

Online Programming Instructor

N/A
2018 - 2019