Aspect-Based Sentiment Analysis of Access by KAI Application Reviews Using IndoBERT for Multi-Label Classification Tasks

Hilda Nur Alfiana; Afrizal Doewes; Bambang Widoyono

doi:10.52436/1.jutif.2026.7.1.5402

Authors

Hilda Nur Alfiana, Informatics, Universitas Sebelas Maret, Indonesia
Afrizal Doewes, Informatics, Universitas Sebelas Maret, Indonesia
Bambang Widoyono, Informatics, Universitas Sebelas Maret, Indonesia

Keywords:

Access by KAI, Aspect-based sentiment analysis, IndoBERT, Multi-label classification, Sentiment analysis

Abstract

Ratings and reviews of mobile applications offer insight into user experience and satisfaction with app features and services. However, ratings are subjective and often inconsistent with the content of the accompanying reviews, so a closer analysis of the review text is needed to identify what users are actually evaluating. This study evaluates the performance of IndoBERT for Aspect-Based Sentiment Analysis (ABSA) on reviews of the Access by KAI application. Data were collected by scraping user reviews from the Google Play Store and then annotated using a hybrid labeling approach. The resulting dataset was used to fine-tune IndoBERT on three ABSA tasks: aspect classification, sentiment classification for each aspect, and joint aspect-sentiment classification. The model was also benchmarked against baseline models. IndoBERT achieved the best performance on all three tasks: aspect classification (accuracy 0.928, F1-score 0.785), sentiment classification (accuracy 0.928, F1-score 0.752), and joint aspect-sentiment classification (accuracy 0.962, F1-score 0.549). It outperformed SVM and XGBoost with TF-IDF features, BiLSTM with pre-trained IndoBERT embeddings, mBERT, and XLM-R. The study contributes a new dataset as a resource for further research and development in Indonesian Natural Language Processing (NLP), and the findings highlight the advantages of a monolingual model trained specifically on Indonesian-language data.
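
The joint aspect-sentiment task can be read as a multi-label problem in which each (aspect, sentiment) pair is its own binary label. The sketch below is illustrative only: the aspect names, the toy reviews, the cell-wise definition of accuracy, and the macro averaging of F1 are all assumptions, since the abstract does not state which aspects or which metric variants were used. It shows how such labels could be encoded as multi-hot vectors, and why accuracy counted over individual label cells can stay high while a macro-averaged F1-score is much lower when most labels are absent from most reviews.

```python
from itertools import product

# Hypothetical label sets for illustration; not the aspects used in the study.
ASPECTS = ["payment", "booking", "login"]
SENTIMENTS = ["positive", "negative"]
JOINT = [f"{a}:{s}" for a, s in product(ASPECTS, SENTIMENTS)]


def encode(present_labels, labels):
    """Return a multi-hot vector with 1 where a label applies to the review."""
    present = set(present_labels)
    return [1 if lab in present else 0 for lab in labels]


def label_accuracy(y_true, y_pred):
    """Fraction of individual label cells predicted correctly (assumed definition)."""
    cells = [(t, p) for yt, yp in zip(y_true, y_pred) for t, p in zip(yt, yp)]
    return sum(t == p for t, p in cells) / len(cells)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1; a label with no positives at all scores 0."""
    scores = []
    for j in range(len(y_true[0])):
        tp = sum(yt[j] == 1 and yp[j] == 1 for yt, yp in zip(y_true, y_pred))
        fp = sum(yt[j] == 0 and yp[j] == 1 for yt, yp in zip(y_true, y_pred))
        fn = sum(yt[j] == 1 and yp[j] == 0 for yt, yp in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Three toy reviews: gold labels versus hypothetical model predictions.
y_true = [
    encode(["payment:negative"], JOINT),
    encode(["login:negative", "booking:positive"], JOINT),
    encode([], JOINT),
]
y_pred = [
    encode(["payment:negative"], JOINT),
    encode(["login:negative"], JOINT),  # misses booking:positive
    encode([], JOINT),
]

print(f"label-wise accuracy: {label_accuracy(y_true, y_pred):.3f}")
print(f"macro F1:            {macro_f1(y_true, y_pred):.3f}")
```

In this toy case 17 of the 18 label cells are correct, yet only two of the six labels have any true positive, so the macro F1 is far lower than the accuracy. Whether the same effect accounts for the gap reported for the joint task depends on the metric definitions given in the full paper.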
