Improving Lateral-Movement Intrusion Detection in Virtualized Networks using SHAP Feature Selection, SMOTE, and a Voting Ensemble Classifier

Authors

  • Avin Maulana, Mathematics Department, Brawijaya University, Indonesia
  • Syaiful Anam, Mathematics Department, Brawijaya University, Indonesia
  • Hilmi Aziz Bukhori, Mathematics Department, Brawijaya University, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2025.6.4.5233

Keywords:

CICIDS2017, Class Imbalance, Intrusion Detection, Lateral Movement, SMOTE, Voting Ensemble

Abstract

Modern virtualized networks, such as those built on VXLAN (Virtual eXtensible LAN), generate heavy east–west traffic that can conceal an attacker's lateral movement. Detecting such infiltration attacks is challenging because overlay encapsulation (e.g., VXLAN) and flat subnet architectures create blind spots for traditional intrusion detection systems (IDS). This study evaluates a methodology for addressing class imbalance in intrusion detection that integrates SHAP-driven feature selection with SMOTE (Synthetic Minority Over-sampling Technique) in a voting ensemble. We conducted an ablation study on the CICIDS2017 Thursday-WorkingHours-Afternoon-Infiltration subset, which is highly imbalanced (36 infiltration flows vs. 288,566 benign flows), varying the SHAP feature set (Top-5 vs. Top-30), the classification threshold, and whether SMOTE balancing was applied. The ensemble combined XGBoost, Random Forest, and Logistic Regression and was evaluated with ROC-AUC, precision, recall, and F1-score. Results indicate that using more SHAP-important features improves ROC-AUC and recall, while SMOTE substantially enhances minority-class detection. The best configuration, Top-30 SHAP features with SMOTE at the selected classification threshold, achieved ROC-AUC = 0.976 and F1-score = 0.78, whereas using fewer features or omitting SMOTE significantly reduced recall and F1-score. This synergy of interpretable feature selection and synthetic oversampling establishes a practical methodology for intrusion detection in highly imbalanced, modern virtualized environments. The novelty lies in demonstrating that integrating SHAP with SMOTE yields both transparency and resilience, directly addressing the encapsulation challenges of detecting stealthy lateral movement.
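
To make the oversampling and threshold steps described in the abstract concrete, the following is a minimal sketch in pure Python. It is not the authors' implementation. The helper names (smote, soft_vote, precision_recall_f1), the toy data, and the per-model probabilities are hypothetical, and soft (probability-averaging) voting is an assumption, since the abstract does not state the voting rule. The SHAP ranking step is omitted.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic minority points (SMOTE-style interpolation)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the chosen sample
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def soft_vote(prob_lists, threshold):
    """Average per-model attack probabilities, then apply a threshold."""
    averaged = [sum(ps) / len(ps) for ps in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive (attack) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy two-feature "infiltration" flows (hypothetical values).
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
synthetic = smote(minority, n_new=6, k=2)

# Hand-written stand-ins for attack probabilities from three models.
p_xgb = [0.9, 0.2, 0.6, 0.1]
p_rf = [0.8, 0.3, 0.4, 0.2]
p_lr = [0.7, 0.1, 0.5, 0.3]
y_true = [1, 0, 1, 0]

for threshold in (0.4, 0.6):
    y_pred = soft_vote([p_xgb, p_rf, p_lr], threshold)
    print(threshold, y_pred, precision_recall_f1(y_true, y_pred))
```

In this toy data, raising the threshold from 0.4 to 0.6 causes one attack flow to be missed, which illustrates why the threshold is treated as a factor alongside feature count and SMOTE. In practice, oversampling would normally be applied to the training split only, so that synthetic flows do not leak into evaluation.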

Published

2025-08-25

How to Cite

[1]
A. Maulana, S. Anam, and H. Aziz Bukhori, “Improving Lateral-Movement Intrusion Detection in Virtualized Networks using SHAP Feature Selection, SMOTE, and a Voting Ensemble Classifier”, J. Tek. Inform. (JUTIF), vol. 6, no. 4, Aug. 2025.