Kirill Didenko

External Contractor - Scientific Data Curation for LLMs

Cancún, Mexico
$45.00/hr
Expert

Key Skills

Software

No software listed

Top Subject Matter

Biomedical Domain Expertise
Biochemical Domain Expertise
Scientific Research

Top Data Types

Text
Document

Top Task Types

Fine-tuning
Entity (NER) Classification
Classification

Freelancer Overview

External Contractor - Scientific Data Curation for LLMs. Brings 16+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include Internal and Proprietary Tooling. Education includes Master of Science, Moscow State University (2006) and Doctor of Philosophy, Moscow State University (2010). AI-training focus includes data types such as Text and Document and labeling workflows including Fine-tuning, Entity (NER) Classification, and Classification.

Expert
English
Russian

Labeling Experience

CTO / Lead AI Architect - Multi-Agent Data Classification & Annotation (PatentRegistrar.com)

Document, Classification
As CTO and Lead AI Architect at PatentRegistrar.com, I architected multi-agent AI pipelines for automated classification, extraction, and validation of domain-specific terminology in patent documents. My responsibilities included orchestrating agent-driven labeling workflows for document analysis and integrating authoritative data sources to enrich and validate annotated data. I developed the system's workflow logic to ensure high-precision labeling for scientific and technical documents across multiple specialized domains.
• Automated extraction and annotation of terminology and scientific entities from large document corpora.
• Designed classification logic for identifying and labeling patents by domain.
• Integrated validation steps to ensure accuracy and completeness of annotations.
• Oversaw system implementation for production-grade document AI labeling pipelines.

2023 - 2026

External Contractor - Scientific Data Curation for LLMs (OpenAI)

Text, Fine-tuning
As an external expert contractor for OpenAI, I contributed to structuring, validating, and normalizing large-scale biomedical and biochemical datasets for use in LLM pipelines. My work involved the preparation of annotated and curated scientific data to enhance the accuracy of language models in biomedical domains. I engaged in collaborative research and discussions focused on data integration and AI-driven knowledge extraction.
• Participated in creating and refining biomedical data for AI training.
• Ensured data quality through rigorous validation and normalization.
• Worked on projects involving explicit data curation for LLMs.
• Facilitated domain-specific annotation workflows for structured knowledge integration.

2025

Chief AI Architect - Biomedical Knowledge Extraction/Annotation (SCV Biosciences)

Text, Entity (NER) Classification
As Chief AI Architect at SCV Biosciences Inc., I designed and implemented an expert system for pharmaceutical R&D involving knowledge extraction and annotation of biomedical entities in scientific literature. Using NLP and BERT-based pipelines, I oversaw entity recognition, fact extraction, and validation for drug discovery workflows. I managed annotation logic and collaborated on terminological classification and consistency of biomedical datasets.
• Developed workflows for extraction and labeling of compound–protein–disease relationships.
• Applied expert annotation to enable target ideation and reasoning-based inference.
• Validated accuracy through team and tool-based review processes.
• Integrated labeled data into knowledge graphs and scientific databases for drug development.

2019 - 2021

Education

Bauman Moscow State Technical University

Master of Science, Computer Science

2006 - 2011

Moscow State University

Doctor of Philosophy, Biochemistry

2006 - 2010

Work History

PatentRegistrar.com

CTO / Lead AI Architect

Cancún
2023 - 2026

OpenAI

External Contractor

N/A
2025