Samuel Mweni - Research Assistant (Data Annotation)

Key Skills

Software

Doccano

Clickworker

Data Annotation Tech

Labelbox

Label Studio

Mercor

Mindrift

OneForma

Remotasks

Scale AI

Snorkel AI

Toloka

Telus

Appen

Top Subject Matter

AI & Data Science – NLP, Text Annotation & Model Evaluation

Finance – Credit Scoring, Risk Modeling & Financial Data Analysis

AI Product Development – Machine Learning Systems, Automation & Applied AI Solutions

Top Data Types

Text

Image

Document

Top Task Types

Entity NER Classification

Text Generation

RLHF

Text Summarization

Fine Tuning

Transcription

Computer Programming Coding

Data Collection

Question Answering

Object Detection

Evaluation Rating

Prompt Response Writing (SFT)

Function Calling

Classification

Freelancer Overview

Experienced in AI data annotation and evaluation, with a strong background in data science and NLP. Skilled in using Doccano and proprietary annotation systems to create high-quality datasets for training and evaluating machine learning models. Holds a Master of Science in Data Science from American University (2024) and a Bachelor of Science in Statistics and Computer Science from the Technical University of Mombasa (2019). Expertise includes text annotation, Named Entity Recognition (NER), classification, evaluation, and rating workflows.

English - Expert

Labeling Experience

Data Science AI Trainer

Text

As a Data Science AI Trainer at Outlier, I focused on evaluating the accuracy and consistency of large language model responses. I developed high-quality prompts for AI model training and fine-tuning tasks. This work helped to improve LLM performance and reliability for downstream applications.

• Evaluated large language model (LLM) outputs for consistency and quality.
• Designed and implemented high-quality prompts for supervised fine-tuning.
• Provided ratings and feedback to guide future model improvements.
• Contributed to the enhancement of LLM safety, fairness, and effectiveness.

2023 - Present

Research Assistant (Data Annotation)

Doccano | Text | Entity NER Classification

As a Research Assistant (Data Annotation) at the National Institutes of Health, I performed text-based annotations for biomedical NLP applications. I researched and evaluated annotation tools tailored for medical datasets, ensuring accurate labeling for AI model training. My efforts contributed to the advancement of biomedical natural language processing systems.

• Annotated over 1,000 medical datasets for AI training purposes.
• Evaluated and compared multiple annotation tools suitable for biomedical text.
• Collaborated with researchers to maintain labeling consistency and quality.
• Enhanced labeled datasets that underpin model development and validation.

2024

Education

American University, College of Arts and Sciences

Master of Science, Data Science

2022 - 2024

Technical University of Mombasa

Bachelor of Science, Statistics and Computer Science

2015 - 2019

Work History

KesiTrack

Project Manager

Nairobi

2024 - Present

Paya Finance

Co-Founder and Chief Executive Officer

Nairobi

2023 - Present