Predicting Anxiety of STMIK Palangkaraya Students Using K-Means Clustering and Gaussian Naïve Bayes

Authors

  • Maura Widyaningsih, Program Studi Teknik Informatika, STMIK Palangka Raya, Palangka Raya, Indonesia
  • Rosmiati, Program Studi Sistem Informasi, STMIK Palangka Raya, Palangka Raya, Indonesia
  • Paholo Iman Prakoso, DIKE, Fakultas MIPA, Universitas Gadjah Mada, Yogyakarta, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2026.7.1.5259

Keywords:

Gaussian Naïve Bayes, K-Means Clustering, Machine Learning, Student Anxiety, Student Prediction

Abstract

Academic anxiety is a common psychological problem among students, especially before final exams, and it affects both learning performance and mental well-being. This study aims to identify and predict students' anxiety levels using a machine learning approach that combines K-Means Clustering and Gaussian Naïve Bayes (GNB), implemented with the Gradio web framework. The research instrument was a Google Forms questionnaire adapted from the Zung Self-Rating Anxiety Scale (ZSAS), with 20 items (K1–K20) rated on a four-point Likert scale (0–3). Data were collected from 110 students of the Information Systems and Informatics Engineering study programs at STMIK Palangkaraya. The research process consisted of five main stages: pre-processing, clustering with the K-Means algorithm, training the GNB classification model, evaluation, and prediction on new data. Clustering grouped the respondents into three anxiety levels: Low, Medium, and High. The GNB model achieved 95% accuracy with balanced precision, recall, and F1 score. In a comparison with other algorithms, SVM achieved the highest accuracy (100%), but GNB handled the uneven class distribution in a more balanced way and was more practical to implement in a web-based system. The prediction system has the potential to serve as an early detection tool for student anxiety and to support educational institutions in designing more targeted psychological interventions. Future work can expand the pool of respondents, balance the class distribution, and test other machine learning methods to improve model generalization. The program and data are available at: https://github.com/maurawidya75/StudentAnxiety2025.
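The stages named in the abstract (clustering, labelling, GNB training, evaluation, prediction on new data) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available in the repository linked above. It uses only the Python standard library, and everything specific in it is an assumption made for illustration: the synthetic responses, the random seeds, the 80/20 split, the small variance floor, and the rule that names clusters Low, Medium and High by ascending centroid sum.

```python
import math
import random
import statistics

LEVELS = ["Low", "Medium", "High"]  # names given to clusters, lowest centroid first

def make_synthetic_responses(n=110, items=20, seed=0):
    """Hypothetical stand-in data: n respondents, 20 answers each on a 0-3 scale."""
    rng = random.Random(seed)
    return [[min(3, max(0, (i % 3) + rng.choice([-1, 0, 0, 1]))) for _ in range(items)]
            for i in range(n)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k=3, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (cluster index per row, centroids)."""
    centroids = [list(r) for r in random.Random(seed).sample(rows, k)]
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows]
        updated = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append([statistics.fmean(col) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def fit_gnb(X, y, var_floor=1e-3):
    """Per class: log prior, per-feature mean, per-feature variance (floored, an assumption)."""
    model = {}
    for c in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == c]
        means = [statistics.fmean(col) for col in zip(*members)]
        variances = [statistics.pvariance(col) + var_floor for col in zip(*members)]
        model[c] = (math.log(len(members) / len(X)), means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the highest Gaussian log posterior."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                               for xi, m, v in zip(x, means, variances))
    return max(model, key=log_posterior)

# Stages 1-2: cluster the responses, then name clusters by ascending centroid sum.
rows = make_synthetic_responses()
cluster_ids, centroids = kmeans(rows, k=3)
order = sorted(range(3), key=lambda c: sum(centroids[c]))
name_of = {c: LEVELS[rank] for rank, c in enumerate(order)}
levels = [name_of[c] for c in cluster_ids]

# Stages 3-4: train GNB on 80% of the rows and measure accuracy on the rest.
idx = list(range(len(rows)))
random.Random(1).shuffle(idx)
cut = int(0.8 * len(idx))
train, test = idx[:cut], idx[cut:]
model = fit_gnb([rows[i] for i in train], [levels[i] for i in train])
accuracy = sum(predict_gnb(model, rows[i]) == levels[i] for i in test) / len(test)

# Stage 5: predict the level of a new, hypothetical respondent.
new_level = predict_gnb(model, [2] * 20)
print(f"held-out accuracy: {accuracy:.2f}, new respondent: {new_level}")
```

Naming clusters by centroid sum assumes that higher item scores indicate more anxiety. Because the class labels come from the clustering itself, the accuracy printed here only measures how well GNB reproduces the cluster assignments on synthetic data and says nothing about the results reported in the study.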

Published

2026-02-15

How to Cite

[1] M. Widyaningsih, R. Rosmiati, and P. I. Prakoso, “Predicting Anxiety of STMIK Palangkaraya Students Using K-Means Clustering and Gaussian Naïve Bayes,” J. Tek. Inform. (JUTIF), vol. 7, no. 1, pp. 169–184, Feb. 2026.