HATE SPEECH DETECTION USING GLOVE WORD EMBEDDING AND GATED RECURRENT UNIT
Abstract
Social media makes it easier for people to exchange information, but the freedom to share has also opened the door to more incidents of hate speech. Hate speech detection matters because, as social media use grows, hate speech can spread quickly and trigger significant negative impacts, including discrimination and social conflict. This research examines the effect of the Gated Recurrent Unit (GRU) method, GloVe word embedding, and an informal word modifier algorithm on hate speech detection. In the detection system, word embedding with the Global Vectors (GloVe) model converts the words in a text into numerical vectors that represent their meaning and context, and a GRU deep learning model processes the resulting sequence of words to capture the sentence structure. GRU was chosen for its ability to capture long-term dependencies in textual data with higher computational efficiency than Long Short-Term Memory (LSTM). Classifying hate speech with GRU and GloVe yields 90.7% accuracy and a 91% F1 score. Adding the informal word modifier algorithm raises these to 92.4% accuracy and a 92.8% F1 score. In conclusion, the informal word modifier algorithm can improve evaluation results in hate speech detection.
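As an illustration of the pipeline the abstract describes (word vectors fed step by step into a GRU, followed by a binary output), here is a minimal pure-Python sketch. It is not the authors' implementation: the embedding table is a toy stand-in for pretrained GloVe vectors, the weights are random and untrained, and all names (`TOY_EMBEDDINGS`, `init_params`, `gru_step`, `encode`, `hate_probability`) are hypothetical. The update rule follows one common GRU formulation; some libraries swap the roles of `z` and `1 - z`.

```python
import math
import random

# Toy 3-dimensional vectors standing in for pretrained GloVe embeddings
# (hypothetical values, for illustration only).
TOY_EMBEDDINGS = {
    "you": [0.1, -0.2, 0.3],
    "are": [0.0, 0.1, -0.1],
    "awful": [-0.6, 0.4, 0.2],
    "great": [0.5, 0.3, -0.2],
}
EMBED_DIM = 3
HIDDEN_DIM = 4


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Multiply a list-of-rows matrix by a vector.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_params(embed_dim, hidden_dim, seed=0):
    # Random, untrained weights; a real model would learn these from data.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = mat(hidden_dim, embed_dim)
        params["U" + gate] = mat(hidden_dim, hidden_dim)
        params["b" + gate] = [0.0] * hidden_dim
    params["w_out"] = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
    params["b_out"] = 0.0
    return params


def gate_input(params, gate, x, h):
    # Pre-activation W x + U h + b for the named gate.
    wx = matvec(params["W" + gate], x)
    uh = matvec(params["U" + gate], h)
    return [a + b + c for a, b, c in zip(wx, uh, params["b" + gate])]


def gru_step(x, h, params):
    # One GRU update: the gates decide how much of the old state to keep.
    z = [sigmoid(v) for v in gate_input(params, "z", x, h)]
    r = [sigmoid(v) for v in gate_input(params, "r", x, h)]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in gate_input(params, "h", x, reset_h)]
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h, cand)]


def encode(tokens, params):
    # Run the GRU over the token sequence; unknown words map to a zero vector.
    h = [0.0] * HIDDEN_DIM
    for token in tokens:
        x = TOY_EMBEDDINGS.get(token, [0.0] * EMBED_DIM)
        h = gru_step(x, h, params)
    return h


def hate_probability(tokens, params):
    # Sigmoid output layer on the final hidden state.
    h = encode(tokens, params)
    score = sum(w * v for w, v in zip(params["w_out"], h)) + params["b_out"]
    return sigmoid(score)


params = init_params(EMBED_DIM, HIDDEN_DIM)
state = encode("you are awful".split(), params)
prob = hate_probability("you are awful".split(), params)
```

Because the weights are untrained, `prob` carries no meaning here; the sketch only shows how each word vector updates the hidden state in sequence. In practice the vectors would come from a pretrained GloVe file and the GRU would be trained with a deep learning library.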
Copyright (c) 2024 Aulia Riefqi Ardana, Yuliant Sibaroni

This work is licensed under a Creative Commons Attribution 4.0 International License.