Optimization of Hybrid K-Means–Naïve Bayes Using Optuna for Classification of Global Plastic Waste Management Levels

Aulya Fani Madani; Poningsih Poningsih; Zulia Almaida; Widodo Saputra

doi:10.52436/1.jutif.2026.7.2.5651

Authors

Aulya Fani Madani, Information Systems, STIKOM Tunas Bangsa, Indonesia
Poningsih, Master of Informatics Study Program, STIKOM Tunas Bangsa, Indonesia
Zulia Almaida, Accounting Computerization, STIKOM Tunas Bangsa, Indonesia
Widodo Saputra, Master of Informatics Study Program, STIKOM Tunas Bangsa, Indonesia


Keywords:

clustering, naïve bayes, optuna, plastic waste management, classification

Abstract

Plastic waste is growing rapidly and has become a serious global environmental challenge, yet existing waste management analysis methods often struggle with large, heterogeneous environmental datasets. This study aims to improve the classification of global plastic waste management performance by combining K-Means clustering and Naïve Bayes with Optuna-based hyperparameter optimization. Using a dataset of plastic waste indicators from multiple countries covering 2020–2024, K-Means is first applied to group the records into waste management level clusters, and these cluster assignments then serve as the class labels for a Naïve Bayes classifier. The hybrid model is further optimized by tuning the var_smoothing parameter with Optuna. Experimental results show that the hybrid approach outperforms the baseline Naïve Bayes model, and that the optimized model raises accuracy from 89% to 95%, with accompanying improvements in precision, recall, F1-score, and ROC-AUC. These results indicate that combining clustering-based labeling with automated hyperparameter optimization can make machine learning models more reliable for large-scale environmental data analysis. The proposed approach can therefore support more accurate evaluation of global plastic waste management and assist data-driven environmental policy development.
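
The three stages described in the abstract (K-Means labeling, Naïve Bayes classification, and tuning of var_smoothing) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn's GaussianNB, which is where a var_smoothing parameter appears, and it uses synthetic data from make_blobs in place of the study's dataset. The number of clusters (3), the search range, the trial count, and the helper name cv_accuracy are all illustrative assumptions. If Optuna is not installed, the sketch falls back to a plain log-uniform random search, which is a stand-in and not Optuna's sampler.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the plastic waste indicators (assumption: not the study's data).
X, _ = make_blobs(n_samples=600, centers=3, n_features=5, cluster_std=2.5, random_state=0)
X = StandardScaler().fit_transform(X)

# Stage 1: K-Means groups the records; the cluster ids become the class labels.
# n_clusters=3 is an assumed number of management levels.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, stratify=labels, random_state=0
)

# Stage 2: baseline Gaussian Naive Bayes with the default var_smoothing (1e-9).
baseline = GaussianNB().fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))


def cv_accuracy(var_smoothing):
    """Mean 5-fold cross-validated accuracy on the training split (hypothetical helper)."""
    model = GaussianNB(var_smoothing=var_smoothing)
    return cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy").mean()


# Stage 3: search var_smoothing on a log scale (range 1e-12..1 is an assumption).
try:
    import optuna

    optuna.logging.set_verbosity(optuna.logging.WARNING)

    def objective(trial):
        return cv_accuracy(trial.suggest_float("var_smoothing", 1e-12, 1.0, log=True))

    study = optuna.create_study(
        direction="maximize", sampler=optuna.samplers.TPESampler(seed=0)
    )
    study.optimize(objective, n_trials=30)
    best_vs = float(study.best_params["var_smoothing"])
except ImportError:
    # Fallback when Optuna is unavailable: plain random search over the same range.
    rng = np.random.default_rng(0)
    candidates = 10.0 ** rng.uniform(-12, 0, size=30)
    best_vs = float(max(candidates, key=cv_accuracy))

# Refit with the selected value and evaluate on the held-out split.
tuned = GaussianNB(var_smoothing=best_vs).fit(X_train, y_train)
tuned_acc = accuracy_score(y_test, tuned.predict(X_test))

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"best var_smoothing: {best_vs:.3e}")
print(f"tuned accuracy: {tuned_acc:.3f}")
```

In a setup like this, accuracy measures how well the classifier reproduces the K-Means cluster assignments, since those assignments are the only labels available. The sketch also clusters the full dataset before splitting, so the labels of the held-out records were produced with those records included in the clustering step.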

