Data Labeling and RLHF Specialist for Mathematical Reasoning Tasks
I specialize in generating high-fidelity training data to strengthen the logical rigor of Large Language Models (LLMs). My work involves producing structured Chain-of-Thought (CoT) reasoning and identifying subtle deductive fallacies in AI-generated outputs. I write verifiable, step-by-step proofs to mitigate logical errors and model hallucinations.

• Break down multi-stage mathematical proofs into sequential reasoning steps to improve interpretability.
• Identify and reconstruct flawed reasoning steps to ensure deductive soundness.
• Deliver fine-grained feedback for RLHF to align model outputs with formal mathematical standards.
• Apply advanced prompt engineering to CoT and logic-evaluation tasks.