Enhancement of the C4.5 Decision Tree Algorithm with ANOVA for Predicting Academic Achievement of Students at SMPN 16 Kota Jambi

Authors

  • Rice Osviarni, Management Information Systems, Universitas Dinamika Bangsa, Jambi, Indonesia
  • Setiawan Assegaff, Management Information Systems, Universitas Dinamika Bangsa, Jambi, Indonesia
  • Jasmir, Management Information Systems, Universitas Dinamika Bangsa, Jambi, Indonesia
  • Nurhadi, Management Information Systems, Universitas Dinamika Bangsa, Jambi, Indonesia

DOI:

https://doi.org/10.52436/1.jutif.2026.7.2.5431

Keywords:

ANOVA, Data Mining, Decision Tree C4.5, Prediction Of Academic Achievement, SMPN 16 Kota Jambi

Abstract

This study aims to improve the accuracy of predicting student academic achievement by integrating Analysis of Variance (ANOVA) with the C4.5 Decision Tree algorithm. From an information systems perspective, the work supports the development of more reliable Decision Support Systems (DSS) and early warning systems in schools. The research was conducted at SMPN 16 Jambi City using secondary data from three academic years (2022/2023 to 2024/2025) covering academic variables, attendance, and parental income. The main issue addressed is the limited ability of the C4.5 algorithm to handle irrelevant features and imbalanced data, which at the system implementation level can lead to inaccurate recommendations or alerts.

The study follows a data mining approach with the following stages: data cleaning, numeric conversion, missing value imputation, construction of derived variables, and categorization of the target variable "Achievement." The initial C4.5 model achieved 72.81% accuracy on the training data and 69.71% accuracy under cross-validation. ANOVA-based feature selection then removed one statistically insignificant variable, yielding a hybrid C4.5+ANOVA model with nine key features. This model reached 80.44% accuracy on the training data and 73.66% under cross-validation, an improvement of 7.63 and 3.95 percentage points, respectively.

The improved model performance translates directly into higher-quality information system output, giving teachers and school management more reliable reports and predictions.
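The ANOVA filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the toy values, and the threshold `f_critical` are hypothetical, chosen only to show how a one-way ANOVA F-statistic can separate informative features from uninformative ones before the retained features are passed to a C4.5-style learner.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a numeric feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n, k = len(values), len(groups)
    grand_mean = mean(values)

    # Variation of the class means around the overall mean.
    ss_between = sum(
        len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the observations around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    if ss_within == 0:
        return float("inf")  # classes are perfectly separated by this feature
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_features(features, labels, f_critical):
    """Keep the features whose F-statistic exceeds the chosen critical value."""
    return [
        name
        for name, values in features.items()
        if anova_f(values, labels) > f_critical
    ]


# Hypothetical toy data: six students, two achievement categories.
LABELS = ["High", "High", "High", "Low", "Low", "Low"]
FEATURES = {
    "attendance": [95, 90, 92, 60, 65, 62],
    "math_score": [85, 88, 90, 70, 72, 68],
    "noise": [5, 7, 6, 6, 5, 7],  # same mean in both classes
}

# 7.71 approximates the critical value of F(1, 4) at alpha = 0.05;
# it is an assumption for this toy example, not a value from the study.
kept = select_features(FEATURES, LABELS, f_critical=7.71)
print(kept)
```

In practice the threshold is usually expressed as a p-value rather than a fixed critical value, and the surviving columns would then be used to train and cross-validate the decision tree.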

Published

2026-04-15

How to Cite

[1]
R. Osviarni, S. Assegaff, J. Jasmir, and N. Nurhadi, “Enhancement Of The C4.5 Decision Tree Algorithm With Anova For Predicting Academic Achievement Of Students At Smpn.16 Kota Jambi”, J. Tek. Inform. (JUTIF), vol. 7, no. 2, pp. 1116–1126, Apr. 2026.
