Analysis of Cybersecurity Data with Random Forest
We built a scalable intrusion detection system by running a machine learning framework on Apache Spark to process the large-scale UWF-ZeekData22 cybersecurity dataset. The system used Principal Component Analysis (PCA) to reduce feature dimensionality and a Random Forest classifier to detect different types of cyber intrusions aligned with the MITRE ATT&CK framework.
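As a rough illustration of the PCA-then-Random-Forest design described above, the following is a minimal PySpark sketch, not the implementation used in this work. The function name `build_pipeline`, the column names, and the default values for `k` and `num_trees` are all assumptions made for the example. It also assumes the input columns are numeric and that the label column already holds numeric class indices.

```python
from typing import List


def build_pipeline(feature_cols: List[str], label_col: str,
                   k: int = 6, num_trees: int = 100):
    """Assemble a PCA + Random Forest pipeline (illustrative sketch only).

    feature_cols, label_col, k and num_trees are hypothetical parameters;
    none of their values are taken from the original system.
    """
    # PCA cannot produce more components than there are input features,
    # so reject a bad k before touching Spark at all.
    if not 1 <= k <= len(feature_cols):
        raise ValueError(
            f"k must be between 1 and {len(feature_cols)}, got {k}")

    # Imported lazily so this module can be loaded on machines
    # where PySpark is not installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.feature import PCA, VectorAssembler

    # Combine the numeric feature columns into a single vector column.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")

    # Project the feature vector onto its first k principal components.
    pca = PCA(k=k, inputCol="features", outputCol="pca_features")

    # Classify the reduced vectors; label_col is assumed to be numeric.
    forest = RandomForestClassifier(featuresCol="pca_features",
                                    labelCol=label_col,
                                    numTrees=num_trees,
                                    seed=42)

    return Pipeline(stages=[assembler, pca, forest])
```

With an active Spark session and a training DataFrame (here called `train_df`, a hypothetical name), the pipeline could be fitted with `build_pipeline(cols, "label_index").fit(train_df)`. Keeping the three stages in one `Pipeline` means the same projection learned on the training data is applied unchanged at prediction time.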