A Comparative Analysis of Hyperparameter-Tuned XGBoost and LightGBM for Multiclass Rainfall Classification in Jakarta

Cokorda Gde Lanang Pringandana; Kusnawi Kusnawi

doi:10.52436/1.jutif.2025.6.4.4965

Authors

Cokorda Gde Lanang Pringandana Informatics, Faculty of Computer Science, Universitas Amikom Yogyakarta, Indonesia
Kusnawi Informatics, Faculty of Computer Science, Universitas Amikom Yogyakarta, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2025.6.4.4965

Keywords:

Classification, Hyperparameter Tuning, LightGBM, Machine Learning, Rainfall Prediction, XGBoost

Abstract

The increasing frequency of extreme weather events in Jakarta has disrupted daily life and critical infrastructure, highlighting the urgent need for accurate rainfall prediction models to support disaster mitigation and early warning systems. This study evaluates and compares two machine learning algorithms, Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), for multiclass rainfall classification using historical meteorological data. The dataset, which includes variables such as temperature, humidity, wind speed, and rainfall, was preprocessed through mean imputation, oversampling to address class imbalance, one-hot encoding, and feature engineering. Both models were tuned with RandomizedSearchCV and assessed through cross-validation and independent testing. XGBoost consistently outperformed LightGBM, achieving 94% accuracy compared with 91%. XGBoost also achieved higher precision, recall, F1-score, and specificity across all rainfall categories, resulting in fewer misclassifications and more stable predictions. The confusion matrices confirmed its superior ability to distinguish between similar weather conditions, such as the cloudy and rainy classes. These findings suggest that XGBoost is more effective at capturing nonlinear interactions between weather features and is therefore better suited to complex tropical climates. The study concludes that XGBoost is the more reliable model and recommends its integration into real-time early warning systems to improve climate resilience and disaster preparedness in urban areas such as Jakarta that are increasingly affected by climate variability.


