CLASSIFICATION OF REGIONAL LANGUAGES USING THE GRADIENT BOOSTING AND RANDOM FOREST METHODS
Abstract
Indonesia has the second-largest number of regional languages in the world. This abundance makes it difficult for people from different regions to recognize which region a language comes from, so the author aims to conduct research on identifying regional languages. Language identification is approached here through data mining, specifically through classification. Classification is a technique that assigns data to a category: it builds a model from data samples that groups items of the same type. Two classification methods are used and compared in this research, gradient boosting and random forest, on regional language data from Java, Nias, and Toraja. Both methods classify the languages fairly well, with accuracy above 0.8 (80%): gradient boosting reaches an accuracy of 0.8850 (88.5%), while random forest is slightly lower at 0.8794 (87.94%). In this study, gradient boosting is therefore the better of the two methods.
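The comparison reported in the abstract rests on accuracy, the share of samples whose predicted language matches the true one. The following is a minimal sketch of that calculation, not the authors' code: the `accuracy` helper and the toy labels are hypothetical and only illustrate the metric, while the two values in `reported` are the accuracies stated in the abstract.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy labels for illustration only (not data from the study).
y_true = ["Java", "Nias", "Toraja", "Java", "Nias"]
y_pred = ["Java", "Nias", "Java", "Java", "Nias"]
toy_accuracy = accuracy(y_true, y_pred)  # 4 of 5 predictions are correct

# Accuracy values as reported in the abstract.
reported = {"gradient boosting": 0.8850, "random forest": 0.8794}

# The method with the highest reported accuracy, and the size of the gap.
best_method = max(reported, key=reported.get)
gap = reported["gradient boosting"] - reported["random forest"]

print(f"toy accuracy: {toy_accuracy:.2f}")
print(f"best method: {best_method} (gap of {gap:.4f})")
```

The gap between the two reported values is about 0.56 percentage points, so the ranking depends on a small difference.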
Copyright (c) 2023 Eva Sapan Patasik, Sri Yulianto
This work is licensed under a Creative Commons Attribution 4.0 International License.