Freelancer Overview
I have extensive experience creating, curating, and evaluating high-quality training data for AI systems, particularly in scientific reasoning, computational biology, and code generation. As an AI trainer and biology subject-matter expert, I have contributed to multiple large-scale projects involving RLHF, SFT, HLE, and Sci-CODE, where I analyzed model outputs, designed multi-step scientific prompts, corrected reasoning chains, and ensured that annotations were factually and scientifically accurate. My expertise spans genomics, genome mining, natural product discovery, and bioinformatics pipelines, which lets me bring deep domain knowledge to complex annotation tasks.
Alongside my biology expertise, I work routinely with Python, scientific libraries, and real-world data from NCBI, Ensembl, KEGG, and antiSMASH, which strengthens my ability to design and validate technical prompts, coding challenges, and structured datasets. I am highly detail-oriented, consistent in my labeling, and experienced in breaking complex scientific problems down into clear, high-quality training data that improves model performance and reliability.