Clinical Feature Labeling & Data Validation – Survival Prediction System
I worked with a dataset of clinical medical records to develop a predictive system for patient survival. My role covered data preprocessing and feature labeling for 12+ clinical variables. I used 5-fold cross-validation to evaluate the models and to check that results built on the "ground truth" labels held up consistently across folds. By identifying and correcting mislabeled records and outlier data points, I helped the machine learning models (Logistic Regression, KNN) reach 87% accuracy in predicting patient outcomes.
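
The validation workflow described above can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline: it uses only the Python standard library, runs on synthetic stand-in data rather than clinical records, covers KNN only (logistic regression is omitted for brevity), and assumes an IQR rule for outlier flagging, which the project description does not specify. The helper names (`k_fold_indices`, `knn_predict`, `cross_validate`, `iqr_outliers`) are hypothetical.

```python
import random
import statistics


def k_fold_indices(n, k=5, seed=0):
    """Shuffle range(n) and split it into k near-equal, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_X[i], x))
    nearest = sorted(range(len(train_X)), key=dist)[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def cross_validate(X, y, k_folds=5, k_neighbors=3, seed=0):
    """Return the per-fold accuracy of KNN under k-fold cross-validation."""
    scores = []
    for held_out in k_fold_indices(len(X), k_folds, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return scores


def iqr_outliers(values, factor=1.5):
    """Indices of values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = factor * (q3 - q1)
    return [i for i, v in enumerate(values) if v < q1 - spread or v > q3 + spread]


# Synthetic stand-in data: two well-separated clusters, NOT clinical records.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

fold_scores = cross_validate(X, y)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print("per-fold accuracy:", fold_scores)
print("mean accuracy:", mean_accuracy)
```

A fold whose accuracy falls well below the others is a reasonable cue to re-inspect the records in that fold for labeling errors, and `iqr_outliers` can be run per numeric feature to shortlist values for manual review before retraining.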