Image Venue Classification (Self-training, Semi-supervised Learning)
I developed an image-based venue classification system using more than 15,000 RGB images from the MIT Places2 dataset. The project covered data preprocessing, classical feature engineering, and training machine learning models to classify venues from their visual content. To leverage the 80% of the data that was unlabeled, I implemented semi-supervised Decision Trees with pseudo-labeling.
• Managed a dataset of RGB images covering venue categories such as Museum, Library, and Shopping Mall.
• Engineered image features (HOG, color histograms, GLCM) to improve classification accuracy.
• Trained Random Forest and SVM models, using GridSearchCV for hyperparameter tuning.
• Applied self-training with confidence-based pseudo-labeling to make full use of the unlabeled data.
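
The self-training step can be illustrated with a minimal sketch. It is not the project's actual code: it assumes scikit-learn's `SelfTrainingClassifier` wrapped around a `DecisionTreeClassifier`, uses synthetic vectors from `make_classification` as a stand-in for the extracted image features, and picks a confidence threshold of 0.9 and `min_samples_leaf=5` purely for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for extracted image features (assumption: the real
# project would use HOG, color histogram, and GLCM vectors instead).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10,
    n_classes=3, random_state=0,
)

# Hide 80% of the labels; scikit-learn marks unlabeled samples with -1.
rng = np.random.default_rng(0)
unlabeled = np.zeros(len(y), dtype=bool)
unlabeled[rng.choice(len(y), size=800, replace=False)] = True
y_partial = y.copy()
y_partial[unlabeled] = -1

# min_samples_leaf > 1 keeps leaf class proportions from always being 0 or 1,
# so the confidence threshold has something to filter on.
base_tree = DecisionTreeClassifier(min_samples_leaf=5, random_state=0)

# Each round: fit on the labeled set, predict the unlabeled samples, and adopt
# predictions whose class probability exceeds the threshold as pseudo-labels.
self_trainer = SelfTrainingClassifier(base_tree, threshold=0.9, max_iter=10)
self_trainer.fit(X, y_partial)

# labeled_iter_ is 0 for originally labeled samples, -1 for samples that were
# never pseudo-labeled, and the round number (>= 1) otherwise.
pseudo_labeled = self_trainer.labeled_iter_ > 0
print(f"pseudo-labeled {pseudo_labeled.sum()} of {unlabeled.sum()} unlabeled samples")
```

Raising the threshold admits fewer but more reliable pseudo-labels; lowering it uses more of the unlabeled pool at the risk of reinforcing early mistakes.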