Rimsha Saeed - Senior Data Analyst – SFT & Data Labeling

Key Skills

Software

Google Cloud Vertex AI

HiveMind

Img Lab

Labelbox

LabelImg

Mercor

OpenCV AI Kit (OAK)

Scale AI

Telus

Top Subject Matter

Healthcare - Data Analysis and Segmentation

Business & Finance - Risk Analysis & Dashboards

E-commerce - Analytics and Fraud Detection

Top Data Types

Text

Image

Document

Top Task Types

Fine Tuning

Classification

Object Detection

Prompt Response Writing SFT

Transcription

Text Summarization

Segmentation

Question Answering

RLHF

Evaluation Rating

Computer Programming Coding

Data Collection

Freelancer Overview

I have extensive experience in AI training data and data labeling, gained on large-scale language model projects at Turing while contributing to the training of Gemini and Claude. I curated and structured over 3,000 high-quality training queries across domains such as mathematics, science, physics, and statistics to build diverse, robust datasets for model validation. My work involved supervised fine-tuning (SFT), generating detailed error reports, and identifying patterns in model outputs to continuously improve accuracy and performance. In addition, I led a team of junior data scientists, developing algorithms that streamlined data processing and improved reporting accuracy, and integrating new data sources for ongoing model improvement. I also helped improve data visualization using Altair and ran rigorous experiments through command-line interfaces to refine model behavior. My combination of hands-on labeling expertise, analytical problem-solving, and leadership in AI training workflows sets me apart in delivering high-quality, scalable training data solutions.

Expert

German

Urdu

Punjabi

English

Labeling Experience

Senior Data Analyst – SFT & Data Labeling

Text

RLHF

The main responsibility was handling Supervised Fine-Tuning (SFT) tasks to improve model performance on various data analysis queries. This involved creating, organizing, and validating training examples to align outputs with target responses for better model accuracy. Reporting and error analysis were also key parts of the process to ensure continuous model quality improvement.

• Curated datasets for SFT tasks.
• Improved training data structure and relevance.
• Conducted error reporting and feedback cycles.
• Supported model evaluation through labeled data.

2024 - 2025

Data Scientist – Curating and Labeling Prompts for LLM Validation

Text

Fine Tuning

This role centered on curating diverse user queries from multiple domains to assist in the validation and evaluation of a language model. Responsibilities included generating, reviewing, and organizing validation prompts that would assess model comprehension, coverage, and robustness. Tasks also included the creation and curation of labeled testing datasets for the purposes of robustness and fairness analysis.

• Generated and selected diverse prompts for evaluation.
• Labeled queries to facilitate structured validation tasks.
• Ensured coverage of math, science, and stats concepts.
• Participated in designing experiments using labeled data.

2023 - 2024

Education

National University of Sciences and Technology

Master of Science, Computational Science and Engineering

2021 - 2023

University of Engineering & Technology

Bachelor of Science, Electrical Engineering

2017 - 2021

Work History

Remote

Research Engineer

Remote

2024 - Present

Turing

Senior Data Scientist

Remote

2024 - 2025