Dfinitysz

Data Labeling and RLHF Specialist for Mathematical Reasoning Tasks

Malaysia
$50.00/hr
Entry Level
Other

Key Skills

Software

Other

Top Subject Matter

Mathematics Domain Expertise
Formal Logic
AI Alignment

Top Data Types

Text

Top Task Types

RLHF

Freelancer Overview

Data Labeling and RLHF Specialist for Mathematical Reasoning Tasks. Core strengths are listed as Other. Education: Master of Science, Tsinghua University. AI-training focus: text data and RLHF labeling workflows.

Entry Level
English

Labeling Experience

Data Labeling and RLHF Specialist for Mathematical Reasoning Tasks

Other
Text
RLHF
I specialize in generating high-fidelity training data to enhance the logical rigor of Large Language Models (LLMs). My work involves providing structured Chain-of-Thought (CoT) reasoning and identifying subtle deductive fallacies in AI-generated outputs. I synthesize verifiable, step-by-step proofs to mitigate logical errors and model hallucinations.
• Break down multi-stage mathematical proofs into sequential reasoning steps to improve interpretability.
• Identify and reconstruct flawed reasoning steps to ensure deductive soundness.
• Deliver high-granularity feedback for RLHF to align model outputs with formal mathematical standards.
• Use advanced prompt engineering for CoT and logic evaluation tasks.


Present

Education

Tsinghua University

Master of Science, Mathematics

Not specified

Work History

none


Guangzhou
2026 - Present