COMPARISON OF K-NEAREST NEIGHBOR AND SUPPORT VECTOR MACHINE ALGORITHM OPTIMIZATION WITH GRID SEARCH CV ON STROKE PREDICTION

  • Wahyu Aprilliandhika, Informatics, Computer Science Faculty, Universitas Amikom Yogyakarta, Indonesia
  • Ferian Fauzi Abdulloh, Informatics, Computer Science Faculty, Universitas Amikom Yogyakarta, Indonesia
Keywords: Comparison, GridSearchCV, K-Nearest Neighbor, Optimization, Support Vector Machine

Abstract

Stroke is the second leading cause of death globally and a major cause of disability. The lack of an optimal stroke prediction system contributes to deaths among stroke patients; identifying whether a patient is experiencing a stroke is therefore the focus of this research. The objective of this study is to compare the stroke prediction performance of two classification models, K-Nearest Neighbors (KNN) and Support Vector Machine (SVM), with and without the GridSearchCV optimization technique. In this experiment, the dataset is preprocessed and divided into training and testing data, with the SMOTE oversampling technique used to address class imbalance. Initial testing is conducted without GridSearchCV and shows that the KNN model performs better than SVM, with accuracies of 91% and 83%, respectively. After parameter optimization with GridSearchCV, both models improve substantially. The KNN model reaches an accuracy of 95% with a precision of 91% and a recall of 98%, while the SVM model reaches an accuracy of 94% with a precision of 90% and a recall of 99%. These results indicate that using GridSearchCV to optimize the parameters of the KNN and SVM models can markedly enhance stroke prediction performance. The two models differ slightly in precision and recall: based on the reported figures, the KNN model has the higher precision (91% versus 90%), while the SVM model has the higher recall (99% versus 98%). In terms of accuracy, the KNN algorithm outperforms SVM in stroke prediction.
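
The idea behind the tuning step described in the abstract can be sketched in a few lines. The example below is a minimal, dependency-free illustration, not the authors' scikit-learn pipeline: it hand-rolls an exhaustive grid search with k-fold cross-validation over two KNN parameters and then reports accuracy, precision, and recall on a held-out split. The function names (knn_predict, classification_metrics, grid_search_cv), the parameter grid, and the synthetic two-cluster data are all assumptions made for illustration; the stroke dataset, the SMOTE step, and the SVM model are not reproduced here.

```python
import itertools
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, p):
    """Majority vote among the k nearest training points.

    p is the Minkowski exponent (1 = Manhattan, 2 = Euclidean).
    """
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)) ** (1 / p), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and q == positive for t, q in pairs)
    fp = sum(t != positive and q == positive for t, q in pairs)
    fn = sum(t == positive and q != positive for t, q in pairs)
    return {
        "accuracy": sum(t == q for t, q in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def grid_search_cv(X, y, param_grid, n_folds=5):
    """Score every parameter combination by mean cross-validated accuracy."""
    # Interleaved folds; assumes the rows were shuffled beforehand.
    folds = [list(range(i, len(X), n_folds)) for i in range(n_folds)]
    keys = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in itertools.product(*(param_grid[key] for key in keys)):
        params = dict(zip(keys, values))
        scores = []
        for held_out in folds:
            held = set(held_out)
            tr_X = [X[i] for i in range(len(X)) if i not in held]
            tr_y = [y[i] for i in range(len(X)) if i not in held]
            preds = [knn_predict(tr_X, tr_y, X[i], **params) for i in held_out]
            truth = [y[i] for i in held_out]
            scores.append(classification_metrics(truth, preds)["accuracy"])
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:  # ties keep the earlier combination
            best_params, best_score = params, mean_score
    return best_params, best_score


# Synthetic stand-in data (hypothetical): two Gaussian clusters, 50 points each.
rng = random.Random(0)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (0.0, 3.0) for _ in range(50)]
y = [label for label in (0, 1) for _ in range(50)]
order = list(range(len(X)))
rng.shuffle(order)
X, y = [X[i] for i in order], [y[i] for i in order]
train_X, train_y, test_X, test_y = X[:80], y[:80], X[80:], y[80:]

# Tune on the training split only, then evaluate once on the test split.
best_params, cv_accuracy = grid_search_cv(train_X, train_y, {"k": [1, 3, 5, 7], "p": [1, 2]})
test_pred = [knn_predict(train_X, train_y, x, **best_params) for x in test_X]
print(best_params, round(cv_accuracy, 3), classification_metrics(test_y, test_pred))
```

The search is run on the training split alone so that the test split remains an unbiased check of the chosen parameters. If an oversampling step such as SMOTE were added to this sketch, it would normally be applied to the training data only, so that synthetic samples do not leak into the evaluation.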

Published
2024-07-24
How to Cite
[1]
W. Aprilliandhika and F. F. Abdulloh, “COMPARISON OF K-NEAREST NEIGHBOR AND SUPPORT VECTOR MACHINE ALGORITHM OPTIMIZATION WITH GRID SEARCH CV ON STROKE PREDICTION”, J. Tek. Inform. (JUTIF), vol. 5, no. 4, pp. 991-1000, Jul. 2024.