AI Data Generation Specialist (Freelancer)
As a Data Scientist at Remotasks.com, I designed and validated structured data annotation trajectories for Large Language Model (LLM) training. I created and refined structured text and code for Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) pipelines, and I ensured the accuracy, consistency, and reproducibility of AI model data through robust annotation workflows.
• Developed and refined annotation guidelines and quality processes for agentic AI systems.
• Created and validated test cases to assess model-generated code and logic.
• Simulated annotation task executions and verified data pipeline correctness with Python, SQL, and BigQuery.
• Authored comprehensive documentation on workflow assumptions, terminal-based operation, and model evaluation steps.