OPTIMIZING SENTIMENT ANALYSIS OF PRODUCT REVIEWS ON MARKETPLACE USING A COMBINATION OF PREPROCESSING TECHNIQUES, WORD2VEC, AND CONVOLUTIONAL NEURAL NETWORK

  • Fahry Maodah, Magister Teknik Informatika, Universitas Amikom Yogyakarta, Indonesia
  • Ema Utami, Magister Teknik Informatika, Universitas Amikom Yogyakarta, Indonesia
  • Sudarmawan, Magister Teknik Informatika, Universitas Amikom Yogyakarta, Indonesia
Keywords: CNN, preprocessing, sentiment, product reviews, word2vec

Abstract

This research aims to identify the most accurate and effective model for sentiment analysis of marketplace product reviews using a combination of preprocessing techniques, word2vec, and a convolutional neural network (CNN). We collected 20,986 reviews of 720 products from a marketplace by web scraping, then cleaned and labeled the data, yielding 515 positive and 490 negative reviews. We preprocessed the data under four different scenarios and built word vector representations with word2vec. We then fed the word2vec output into a CNN architecture to classify the sentiment of each review. After testing several variations of each technique, we found that the best results came from combining the third preprocessing scenario (case folding, punctuation removal, word normalization, and stemming), the second word2vec parameter combination (size 50, window 2, hs 0, and negative 10), and the fourth CNN parameter combination (kernel size 2, dropout 0.2, and learning rate 0.01): an accuracy of 99.00%, a precision of 98.96%, and a recall of 98.96%. Word normalization contributed substantially to accuracy by correcting misspelled or nonstandard words in the reviews. In the word2vec evaluation, hs 0 produced a higher average accuracy than hs 1 because hs 0 uses negative sampling, which helped the model capture the context of the trained words. For the CNN, a higher learning rate lets the model learn faster but can make training unstable, whereas a lower learning rate makes training more stable but slower.
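
As an illustration only, the four steps of the third preprocessing scenario (case folding, punctuation removal, word normalization, and stemming) could be chained roughly as in the sketch below. This is not the authors' code: the function names, the small normalization dictionary, and the pass-through stemmer are all assumptions made for the example, since the abstract does not give the actual dictionary or stemmer used.

```python
import string
from typing import Callable, Dict, List

# Hypothetical slang-to-standard lookup for Indonesian review text.
# The dictionary actually used in the study is not given in the abstract.
NORMALIZATION_MAP: Dict[str, str] = {
    "bgt": "banget",
    "brg": "barang",
    "gk": "tidak",
    "ga": "tidak",
}

# Translation table that turns every ASCII punctuation mark into a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def identity_stem(token: str) -> str:
    """Placeholder stemmer (assumption): returns the token unchanged.

    A real Indonesian stemmer would be passed in here instead.
    """
    return token


def preprocess(
    review: str,
    normalization_map: Dict[str, str] = NORMALIZATION_MAP,
    stem: Callable[[str], str] = identity_stem,
) -> List[str]:
    """Apply the four steps in order and return the list of tokens."""
    text = review.lower()                 # 1. case folding
    text = text.translate(_PUNCT_TABLE)   # 2. punctuation removal
    tokens = text.split()                 # whitespace tokenization
    tokens = [normalization_map.get(t, t) for t in tokens]  # 3. word normalization
    return [stem(t) for t in tokens]      # 4. stemming


if __name__ == "__main__":
    print(preprocess("Brg bagus bgt, gk nyesel!"))
```

The resulting token lists are the kind of input a word2vec model would then be trained on. Keeping the stemmer as a parameter makes it easy to compare scenarios with and without stemming, as the four-scenario setup in the abstract implies.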

Published
2023-02-10
How to Cite
[1]
F. Maodah, E. Utami, and S. Sudarmawan, “OPTIMIZING SENTIMENT ANALYSIS OF PRODUCT REVIEWS ON MARKETPLACE USING A COMBINATION OF PREPROCESSING TECHNIQUES, WORD2VEC, AND CONVOLUTIONAL NEURAL NETWORK”, J. Tek. Inform. (JUTIF), vol. 4, no. 1, pp. 101-107, Feb. 2023.
