Optimizing Breast Cancer Classification: SVM and Random Forest with Hybrid Hyperparameter Tuning and Feature Selection

Authors

  • Adil Setiawan, Department of Computer Science, Universitas Potensi Utama, Indonesia
  • Soeheri, Department of Computer Science, Universitas Potensi Utama, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2026.7.3.5720

Keywords:

Breast Cancer Classification, Feature Selection, Hyperparameter Tuning, Random Forest, Support Vector Machine

Abstract

Breast cancer remains one of the leading causes of cancer-related mortality among women worldwide, underscoring the need for early, accurate, and reliable diagnostic support systems. This study proposes an optimized breast cancer classification framework using Support Vector Machine (SVM) and Random Forest (RF) models enhanced through hybrid hyperparameter tuning and feature selection. The study uses the Breast Cancer Wisconsin (Diagnostic) dataset, which comprises 569 samples with 30 numerical features derived from fine needle aspirate (FNA) examinations. Feature selection was conducted using Random Forest feature importance to identify the most relevant diagnostic attributes and reduce dimensionality. Hybrid hyperparameter tuning was implemented using GridSearchCV combined with 5-fold cross-validation to obtain optimal model configurations. Model performance was evaluated using accuracy, malignant-class recall, confusion matrix analysis, and Receiver Operating Characteristic–Area Under the Curve (ROC–AUC). Experimental results show that the optimized SVM model achieved significant improvements in accuracy, recall, and ROC–AUC compared to baseline models, indicating enhanced sensitivity and discrimination capability. The Random Forest model maintained stable performance, with marginal gains after optimization. These findings highlight the importance of systematic optimization strategies in improving diagnostic safety and reducing false negatives, thereby contributing to more reliable and clinically applicable machine learning-based medical decision support systems.
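
The workflow summarized in the abstract (Random Forest importance-based feature selection, GridSearchCV with 5-fold cross-validation, and evaluation by accuracy, malignant-class recall, confusion matrix, and ROC–AUC) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The 80/20 stratified split, the median importance threshold, the feature scaling step, the small SVM parameter grid, the accuracy-based search criterion, and the fixed random seeds are all assumptions that the abstract does not specify.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# scikit-learn's copy of the Wisconsin (Diagnostic) data: 569 samples, 30 features.
# In this copy the labels are 0 = malignant, 1 = benign.
X, y = load_breast_cancer(return_X_y=True)

# Assumed hold-out split; the abstract does not state one.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Selection sits inside the pipeline so it is refit on each training fold,
# which keeps the validation folds out of the feature ranking.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=42),
        threshold="median",  # assumed cutoff: keep features at or above median importance
    )),
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical grid, kept small for illustration.
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": ["scale", 0.01]}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate the refit best pipeline on the held-out split.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
malignant_recall = recall_score(y_test, y_pred, pos_label=0)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])  # rows: true, columns: predicted
# For two classes the AUC is the same whichever class is treated as positive.
auc = roc_auc_score(y_test, search.decision_function(X_test))

n_selected = int(search.best_estimator_.named_steps["select"].get_support().sum())
print("best parameters:", search.best_params_)
print("features kept:", n_selected)
print("accuracy:", accuracy)
print("malignant-class recall:", malignant_recall)
print("ROC-AUC:", auc)
print("confusion matrix:")
print(cm)
```

Malignant-class recall is read from the first row of the confusion matrix: it is the share of truly malignant samples that were predicted malignant, so each false negative lowers it directly. Swapping the final step for a RandomForestClassifier with its own grid would give the corresponding Random Forest variant.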

Published

2026-06-15

How to Cite

[1] A. Setiawan and S. Soeheri, “Optimizing Breast Cancer Classification: SVM and Random Forest with Hybrid Hyperparameter Tuning and Feature Selection,” J. Tek. Inform. (JUTIF), vol. 7, no. 3, pp. 2778–2789, Jun. 2026.