Comparative Analysis of the Performance of Random Forest and CatBoost for Air Quality Prediction Based on Meteorological Factor

Authors

  • Nirsal, Informatika, Universitas Cokroaminoto Palopo, Indonesia
  • Nurchaerani Kadir, Informatika, Universitas Cokroaminoto Palopo, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2026.7.3.5412

Keywords:

Air Quality, Meteorological Variables, Random Forest, CatBoost, Machine Learning

Abstract

Air quality in urban centers such as Tangerang City has become an increasingly urgent concern as industrial activity expands, the population grows rapidly, and vehicle emissions rise. As a key city within the Greater Jakarta metropolitan area, Tangerang is highly vulnerable to air pollution driven by human activity and changing meteorological conditions. This study compares two machine learning algorithms, Random Forest and CatBoost, for predicting air quality in Tangerang under two scenarios: models that include meteorological factors and models that exclude them. The dataset contains concentrations of key air pollutants together with meteorological variables such as temperature, humidity, and wind speed. Model performance was evaluated using mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and the coefficient of determination (R²). Both algorithms achieved high accuracy when meteorological variables were included. Random Forest achieved an MAE of 0.0099, MSE of 0.000309, RMSE of 0.0152, and R² of 0.9931, slightly outperforming CatBoost, which recorded an MAE of 0.0135, MSE of 0.000419, RMSE of 0.0170, and R² of 0.9907. Excluding meteorological variables reduced accuracy for both models: R² fell to 0.9519 for Random Forest and 0.9487 for CatBoost. These results underscore the importance of temperature, humidity, and wind speed for predictive accuracy. By comparing the two models in the specific urban setting of Tangerang, the study shows how meteorological factors influence air quality predictions, and it contributes to the development of adaptive air quality prediction models that support sustainable environmental management planning in Tangerang City.
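
The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, chosen only to show how MAE, MSE, RMSE, and R² are conventionally defined for a set of observed and predicted values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root of the MSE

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up values for illustration only; these are not data from the study.
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.12, 0.18, 0.33, 0.39]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Under these definitions, lower MAE, MSE, and RMSE and an R² closer to 1 indicate a better fit, which is the sense in which the two models are compared.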

Published

2026-06-15

How to Cite

N. Nirsal and N. Kadir, “Comparative Analysis of the Performance of Random Forest and CatBoost for Air Quality Prediction Based on Meteorological Factor,” J. Tek. Inform. (JUTIF), vol. 7, no. 3, pp. 2154–2164, Jun. 2026.