COMPARISON OF CLASSIFICATION ALGORITHM AND FEATURE SELECTION IN BITCOIN SENTIMENT ANALYSIS
Abstract
Sentiment analysis is the process of extracting information from textual data in order to determine how an object under study is evaluated. Sentiment expressed by the general public can serve as a reference when making product decisions, and it can be positive, negative, or neutral. One information technology product that has attracted considerable attention over the last decade is Bitcoin. The purpose of this study is to compare several classification algorithms in combination with Feature Selection. Several classification algorithms can be used for sentiment analysis, including Deep Learning, Decision Tree, KNN, and Naïve Bayes. Textual sentiment classification is constrained by the high dimensionality of its datasets. Feature Selection addresses this by reducing the dimensions of a dataset, removing the attributes that are less relevant. The Feature Selection methods used are Information Gain and Chi Square. The comparison proceeds in three steps: first, the four classification algorithms are compared to find the best algorithm; second, the two Feature Selection methods are compared to find the better one; third, the best classification algorithm is integrated with the best Feature Selection method. The results show that the best classification algorithm was Deep Learning, with an accuracy of 78.43% and a kappa of 0.625. In the Feature Selection comparison, Information Gain gave the best results, with an average accuracy of 63.79% and an average kappa of 0.382. Integrating the best classification algorithm with the best Feature Selection method yielded an accuracy of 78.63% and a kappa of 0.626, a value that falls in the Fair Classification category.
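The two Feature Selection scores and the two evaluation measures named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation or tooling: the function names, the toy labels, and the two example terms ("moon" and "the") are hypothetical and chosen only to show how each quantity is computed for a single term treated as a present/absent feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(present, labels):
    """Entropy reduction obtained by splitting documents on term presence."""
    with_term = [y for p, y in zip(present, labels) if p]
    without_term = [y for p, y in zip(present, labels) if not p]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder


def chi_square(present, labels):
    """Pearson chi-square statistic between term presence and class label."""
    total = len(labels)
    observed = Counter(zip(present, labels))
    row, col = Counter(present), Counter(labels)
    score = 0.0
    for p in row:
        for y in col:
            expected = row[p] * col[y] / total
            score += (observed[(p, y)] - expected) ** 2 / expected
    return score


def accuracy_and_kappa(y_true, y_pred):
    """Accuracy (observed agreement) and Cohen's kappa for two label lists."""
    total = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / total
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Agreement expected by chance from the two marginal distributions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / total ** 2
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: six documents, three sentiment classes.
labels = ["pos", "pos", "neg", "neg", "neu", "neu"]
terms = {
    "moon": [True, True, False, False, False, False],  # only in positive docs
    "the": [True, False, True, False, True, False],    # spread evenly
}

for name, present in terms.items():
    print(name, information_gain(present, labels), chi_square(present, labels))

# Hypothetical predictions from some classifier, scored against the labels.
predicted = ["pos", "pos", "neg", "neu", "neu", "neu"]
print(accuracy_and_kappa(labels, predicted))
```

In this toy data the term that appears only in positive documents scores higher on both measures than the term spread evenly across classes, which is the property Feature Selection relies on when it discards less relevant attributes. Kappa is lower than accuracy because it discounts the agreement expected by chance.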
Copyright (c) 2022 Indri Tri Julianto
This work is licensed under a Creative Commons Attribution 4.0 International License.