WORD EMBEDDING OPTIMIZATION IN SENTIMENT ANALYSIS OF REVIEWS ON MYTELKOMSEL APP USING LONG SHORT-TERM MEMORY AND SYNTHETIC MINORITY OVER-SAMPLING TECHNIQUE

  • Muhammad Raffif Haziq, Informatics, School of Computing, Telkom University, Indonesia
  • Yuliant Sibaroni, Informatics, School of Computing, Telkom University, Indonesia
  • Sri Suryani Prasetyowati, Informatics, School of Computing, Telkom University, Indonesia
Keywords: FastText, GloVe, LSTM-SMOTE, Sentiment analysis, Word Embedding, Word2vec

Abstract

Telkomsel is an internet service provider whose mobile application, MyTelkomsel, lets users manage their services online on their own. Users form their own opinions of the application and can express them in reviews, so sentiment analysis offers one way to gauge public sentiment towards it. This research builds a sentiment analysis system that uses the Word2vec, GloVe, and FastText word embeddings to obtain vector representations of words, with classification by Long Short-Term Memory (LSTM) combined with the Synthetic Minority Over-sampling Technique (SMOTE) to handle data imbalance. The data comes from user reviews of the MyTelkomsel application on the Google Play Store. The study compares the performance of the three word embeddings in LSTM and LSTM-SMOTE classifiers. The results show that all three word embeddings perform better with the LSTM model than with the LSTM-SMOTE model. Overall, the combination of FastText and LSTM gave the best performance of the six combinations, with an accuracy of 89.11%.
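
To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority-class neighbors. It is an illustration only and not the authors' implementation. The function name `smote_oversample`, the parameter `k`, and the toy two-dimensional vectors are assumptions made for this example; in the setting described above, the inputs would instead be vector representations of reviews.

```python
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbors of base (squared Euclidean distance)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # random position on the segment base -> neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Toy data (assumed): 4 minority vectors versus a majority class of 10 samples.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
majority_count = 10
new_points = smote_oversample(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority))
```

Because every synthetic point lies on a segment between two existing minority points, the minority class grows to match the majority class without exact duplication of any review vector.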

Published
2024-12-28
How to Cite
M. R. Haziq, Y. Sibaroni, and S. S. Prasetyowati, “WORD EMBEDDING OPTIMIZATION IN SENTIMENT ANALYSIS OF REVIEWS ON MYTELKOMSEL APP USING LONG SHORT-TERM MEMORY AND SYNTHETIC MINORITY OVER-SAMPLING TECHNIQUE”, J. Tek. Inform. (JUTIF), vol. 5, no. 6, pp. 1581-1589, Dec. 2024.