COMPARISON OF LOGISTIC REGRESSION, MULTINOMIALNB, SVM, AND K-NN METHODS ON SENTIMENT ANALYSIS OF GOJEK APP REVIEWS ON THE GOOGLE PLAY STORE
Abstract
Transportation is integral to daily life today because it lets people reach their destinations more quickly. The Gojek application, available on the Google Play Store, helps people travel and deliver goods. To assess service quality, sentiment analysis can be used to classify user reviews. This study compares four methods, Logistic Regression, MultinomialNB, SVM, and K-NN, to determine which most accurately classifies reviews as positive or negative. Performance is assessed using accuracy, precision, and recall scores, the classification report, and the confusion matrix. Of the four methods tested, Logistic Regression performed best, with accuracy, precision, recall, and F1 scores of 82.45%, 82.49%, 82.45%, and 82.43%, respectively. Its classification report also shows good results, and its confusion matrix contains 111 true positives and 124 true negatives, against only 22 false positives and 28 false negatives. K-NN scored lowest, with accuracy, precision, recall, and F1 scores of 52.28%, 59.43%, 93.52%, and 65.65%, respectively. Its classification report shows poor results, and its confusion matrix contains 130 true positives and 19 true negatives, against 127 false positives and 9 false negatives. The study concludes that Logistic Regression is suitable for classifying positive and negative reviews in the dataset of Gojek application reviews from the Google Play Store.
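The link between the confusion-matrix counts and the scores quoted in the abstract can be illustrated with a minimal sketch. The function name `binary_metrics` is hypothetical, and the formulas are the standard positive-class definitions; the study may have reported averaged scores instead (an assumption), so the precision, recall, and F1 values computed here need not match the abstract exactly.

```python
def binary_metrics(tp, tn, fp, fn):
    """Return accuracy plus positive-class precision, recall, and F1."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total          # share of all reviews classified correctly
    precision = tp / (tp + fp)            # share of predicted positives that are correct
    recall = tp / (tp + fn)               # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Confusion-matrix counts as reported in the abstract.
logreg = binary_metrics(tp=111, tn=124, fp=22, fn=28)
knn = binary_metrics(tp=130, tn=19, fp=127, fn=9)

for name, scores in (("Logistic Regression", logreg), ("K-NN", knn)):
    formatted = ", ".join(f"{key}={value:.2%}" for key, value in scores.items())
    print(f"{name}: {formatted}")
```

Under these definitions the accuracies come out at about 82.46% and 52.28%, in line with the reported figures. The K-NN counts also show why its recall is high while its accuracy is low: it labels most reviews positive, so it finds nearly all true positives but produces 127 false positives.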
Copyright (c) 2023 Audenza Maulana, Inayah Khasnaputri Afifah, Asghafi Mubarrak, Kiagus Rachmat Fauzan, Ardhan Dwintara
This work is licensed under a Creative Commons Attribution 4.0 International License.