TEXT MINING WITH LATENT DIRICHLET ALLOCATION FOR ANALYZING PUBLIC COMMENTS ON THE M-PASSPORT APPLICATION
Abstract
The M-Passport application is a service developed by the Directorate General of Immigration of Indonesia to help the public apply for new passports and replace existing ones online. In practice, however, the application has not satisfied its users, as shown by its low rating and the many negative comments on the Google Play Store. One way to identify the application's shortcomings is to analyze user comments. To handle the large volume of comment data, this study applies text mining with Latent Dirichlet Allocation (LDA) topic modeling. The analysis aims to find the topics most frequently discussed in the comments so that the government can identify the application's shortcomings. LDA topic modeling produced seven topics, from which the three with the highest coherence values were selected. These three topics were then interpreted to obtain information about the public's concerns regarding the application. The interpretation shows that users frequently fail to log in or register, that users feel the application does not help them with passport processing because of constraints in the online queue feature, and that some users still find the application difficult to use.
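The step of selecting the topics with the highest coherence can be illustrated with a minimal sketch. The abstract does not say which coherence measure or software the study used, so the UMass-style score below is an assumption, and the tokenized comments, topic names, and top words are hypothetical toy data rather than results from the study.

```python
import math

def umass_coherence(top_words, docs):
    """UMass-style coherence: sum of log((D(w_i, w_j) + 1) / D(w_j)) over
    word pairs, where D counts the documents containing the word(s).
    The choice of this measure is an assumption, not taken from the study."""
    doc_sets = [set(d) for d in docs]
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            w_i, w_j = top_words[i], top_words[j]
            d_j = sum(1 for d in doc_sets if w_j in d)
            if d_j == 0:
                continue  # skip words that never occur in the corpus
            d_ij = sum(1 for d in doc_sets if w_i in d and w_j in d)
            score += math.log((d_ij + 1) / d_j)
    return score

def top_topics(topics, docs, k=3):
    """Return the names of the k topics with the highest coherence."""
    ranked = sorted(topics, key=lambda name: umass_coherence(topics[name], docs),
                    reverse=True)
    return ranked[:k]

# Hypothetical tokenized comments (toy data, not from the study).
docs = [
    ["login", "failed", "otp"],
    ["register", "failed", "otp"],
    ["login", "register", "failed"],
    ["queue", "full", "schedule"],
    ["queue", "schedule", "office"],
    ["queue", "full", "office"],
    ["app", "difficult", "use"],
    ["app", "slow", "use"],
]

# Hypothetical top words per topic, as a fitted LDA model might report them.
topics = {
    "topic_0": ["failed", "login", "register"],
    "topic_1": ["queue", "schedule", "full"],
    "topic_2": ["app", "use", "difficult"],
    "topic_3": ["login", "queue", "app"],  # words that never co-occur
}

selected = top_topics(topics, docs, k=3)
print(selected)
```

In this toy setup the topic whose top words never appear together in the same comment scores lowest and is dropped, which mirrors the idea of keeping only the most coherent topics for interpretation.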
Copyright (c) 2024 Theresia Shinta Hapsari, Yessica Nataliani
This work is licensed under a Creative Commons Attribution 4.0 International License.