Sisi Wang

AI Data Analyst - Large Language Models

Hangzhou, China
$15.00/hr, Intermediate, Internal Proprietary Tooling

Key Skills

Software

Internal/Proprietary Tooling

Top Subject Matter

No subject matter listed

Top Data Types

Text

Top Task Types

Text Generation
RLHF
Computer Programming / Coding
Data Collection
Prompt Response Writing (SFT)

Freelancer Overview

I am an AI data analyst with hands-on experience in building and optimizing high-quality datasets for training and fine-tuning large language models across domains such as finance, technology, and automotive safety. My work includes designing and implementing end-to-end data collection, cleaning, and annotation pipelines, as well as developing strict annotation guidelines that improved inter-annotator agreement and overall data consistency. I have led the construction of domain-specific datasets that significantly reduced model error rates, and I am skilled in prompt engineering, LLM evaluation, and automated metric analysis for NLP tasks. My technical toolkit includes Python, SQL, R, and platforms like AWS and Google Cloud, along with visualization tools such as Tableau and Power BI. I am passionate about using data-driven approaches to enhance AI model performance, ensure fairness, and deliver actionable insights for product and algorithm development.

Intermediate, Korean, English, Chinese (Mandarin)

Labeling Experience

Turing / Talents AI / Labelness

Internal Proprietary Tooling, Text, Text Generation, RLHF
Led the development of specialized datasets for training and fine-tuning large language models (LLMs) across domains including finance and technology. Designed and implemented data collection, cleaning, and annotation pipelines to ensure data quality and consistency. Collaborated with cross-functional teams to define data requirements and quality standards. Developed and enforced strict annotation guidelines, resulting in a 30% improvement in inter-annotator agreement. Designed and executed evaluation protocols to assess the performance, safety, and fairness of various LLMs. Developed a suite of automated evaluation scripts to measure metrics such as accuracy, coherence, and bias. Engineered and optimized prompts to enhance model performance on NLP tasks including text summarization, question answering, and code generation. Curated a library of over 500 effective prompt templates for internal use.

2025

Education

University of Missouri-Kansas City

Doctor of Philosophy, Computer Science

2026 - 2031

Monash University

Master of Artificial Intelligence, Artificial Intelligence

2023 - 2025

Work History

XPeng Motors

Data Analyst Intern

Shanghai
2025

NIO

Data Analyst

Shanghai
2021 - 2023