Bioinformatician
Data Labeling & Curation: Cleaned, structured, and labeled large-scale biological datasets, ensuring consistent identifiers, metadata accuracy, and high data quality.
Data Annotation: Annotated features using public reference databases (Ensembl, NCBI, g:Profiler), assigning functional, categorical, and pathway labels to raw data.
Cross-Dataset Alignment: Standardized and aligned labeled features across multiple data sources to enable reliable downstream analysis and integration.
Labeled Data for Modeling: Prepared and validated annotated datasets for statistical and machine-learning models, supporting accurate classification and pattern detection.
Workflow Automation & Quality Control: Built reproducible pipelines to automate data preprocessing, labeling, validation, and version control across large datasets.