COMPARISON PERFORMANCE OF WORD2VEC, GLOVE, FASTTEXT USING SUPPORT VECTOR MACHINE METHOD FOR SENTIMENT ANALYSIS
Abstract
Spotify is a digital audio service that provides music and podcasts. Reviews of the application can influence prospective users deciding whether to download it. The unstructured nature of review text makes it challenging to process, and producing a valid sentiment analysis requires a word embedding. The dataset is split 80:20 into training and testing data. Word2Vec, GloVe, and FastText are used for feature expansion, and a Support Vector Machine (SVM) is used for classification. These three word embedding methods were chosen because, compared with traditional feature engineering such as Bag of Words, they can capture the semantic, syntactic, and contextual meaning around words. The evaluation results show that GloVe performs best among the three word embeddings, with an accuracy of 85%, a precision of 90%, a recall of 79%, and an F1-score of 85%.
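The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The toy embedding table, the mean-pooling of word vectors into one document vector, the shuffling seed, and all function names are assumptions made for this example. The abstract does not say how word vectors are combined or how the split is drawn.

```python
import random

# Hypothetical toy embedding table; a real run would load Word2Vec, GloVe,
# or FastText vectors instead.
TOY_VECTORS = {
    "great": [0.9, 0.1],
    "love": [0.8, 0.2],
    "bad": [-0.7, 0.3],
    "crash": [-0.9, 0.1],
    "app": [0.0, 0.5],
}
DIM = 2


def document_vector(text, vectors=TOY_VECTORS, dim=DIM):
    """Average the vectors of known tokens (assumed pooling strategy)."""
    tokens = [t for t in text.lower().split() if t in vectors]
    if not tokens:
        return [0.0] * dim  # no known token: fall back to the zero vector
    return [sum(vectors[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def split_80_20(items, seed=0):
    """Shuffle a copy of the data and split it 80:20 into train and test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a full run, the document vectors of the training portion would be passed to an SVM classifier (for instance scikit-learn's `SVC`, which is not shown here), and `binary_metrics` would then be applied to its predictions on the test portion.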
Copyright (c) 2024 Margaretha Anjani, Helena Nurramdhani Irmanda
This work is licensed under a Creative Commons Attribution 4.0 International License.