Malware Classification & Opcode Data Annotation
Annotated and structured opcode-based datasets for malware classification. Labeled samples by malware family and validated dataset consistency so that models were trained on accurate labels. Preprocessed raw opcode sequences, removing noisy and redundant patterns. Extracted N-gram features from the cleaned sequences and trained machine learning classifiers on them. Used SHAP, an explainable-AI method, to analyze model predictions and flag potential labeling errors or inconsistencies. This project strengthened my ability to handle complex structured datasets and produce high-quality annotations for multi-class problems.
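
To illustrate the preprocessing and N-gram steps, here is a minimal sketch in standard-library Python. It is not the project's actual code: the function names (`collapse_repeats`, `opcode_ngrams`), the sample opcode sequence, and the choice to treat immediate repeats as the "redundant patterns" are all assumptions made for illustration.

```python
from collections import Counter


def collapse_repeats(opcodes):
    """Normalize opcodes and drop immediate repeats.

    Assumption: "redundant patterns" is approximated here as runs of the
    same opcode (e.g. nop, nop, nop); a real pipeline may use other rules.
    """
    cleaned = []
    for op in opcodes:
        op = op.strip().lower()
        if not op:
            continue  # skip empty tokens
        if cleaned and cleaned[-1] == op:
            continue  # skip an immediate repeat of the previous opcode
        cleaned.append(op)
    return cleaned


def opcode_ngrams(opcodes, n=2):
    """Count contiguous opcode n-grams in a single sample."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


# Hypothetical opcode sequence for one sample.
sample = ["push", "mov", "mov", "NOP", "nop", "nop", "call", "mov", "call"]

cleaned = collapse_repeats(sample)
bigrams = opcode_ngrams(cleaned, n=2)

for gram, count in sorted(bigrams.items()):
    print(" ".join(gram), count)
```

The per-sample counts can then be arranged into a feature matrix (one column per N-gram) for whichever classifier is used.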