SVM OPTIMIZATION WITH INFORMATION GAIN FEATURE SELECTION TO INCREASE THE ACCURACY OF SENTIMENT ANALYSIS OF INCREASING THE COST OF THE HAJJ
Abstract
People now exercise their freedom to express opinions largely through social media, platforms that let users communicate with one another over the internet. YouTube is one of the most popular social media platforms worldwide. In 2023, the Government, represented by the Ministry of Religious Affairs of the Republic of Indonesia, and Commission VIII of the House of Representatives approved the Hajj Travel Cost for 1444 H/2023 AD at about Rp90,050,637.26 per regular pilgrim. By contrast, the government of the Kingdom of Saudi Arabia reduced the cost of its Hajj packages by 30% compared with 2022. The increase has therefore drawn both support and opposition. This research conducts sentiment analysis of public opinion on social media about the Hajj cost increase. Sentiment analysis has been developed through various methods, but producing accurate results remains difficult. The challenges include accuracy, binary classification, data sparsity, and polarity shift. This research focuses on one of them: improving accuracy. The Support Vector Machine method is applied, first alone and then with Information Gain feature selection added. The Support Vector Machine alone reached an accuracy of 87%, and the Support Vector Machine combined with Information Gain feature selection reached 89%. It can be concluded that combining the Support Vector Machine with Information Gain feature selection increased accuracy by 2 percentage points.
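To make the feature selection step concrete, the following is a minimal sketch of how Information Gain can rank terms before classification. It is not the study's implementation, which the abstract does not describe: the toy comments, the labels, and the function names (`entropy`, `information_gain`, `select_top_k`) are all hypothetical, and each term is treated as a simple present/absent feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Information Gain of the binary feature 'term occurs in the document'.

    IG(term) = H(class) - [P(term) H(class | term) + P(no term) H(class | no term)]
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


def select_top_k(docs, labels, k):
    """Return the k terms with the highest Information Gain (ties broken alphabetically)."""
    vocabulary = sorted({term for d in docs for term in d})
    scored = [(information_gain(docs, labels, term), term) for term in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical, already tokenized comments; NOT data from the study.
docs = [
    {"cost", "too", "expensive"},
    {"expensive", "burden"},
    {"fair", "cost"},
    {"fair", "service"},
]
labels = ["negative", "negative", "positive", "positive"]

if __name__ == "__main__":
    for term in select_top_k(docs, labels, 3):
        print(term, round(information_gain(docs, labels, term), 4))
```

In this toy corpus, a term such as "cost" that appears equally often in both classes scores zero and would be dropped, while terms that appear in only one class score highest. In a pipeline like the one the abstract describes, only the retained terms would be passed on as features to the Support Vector Machine classifier, which is not shown here.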
Copyright (c) 2024 MANARUL HIDAYAT, ARIEF WIBOWO
This work is licensed under a Creative Commons Attribution 4.0 International License.