PERSONALITY DETECTION ON TWITTER USER USING XGBOOST ALGORITHM
Abstract
Personality is the identity a person presents to the public, and the Big Five is the most commonly used personality model. Detecting personality remains difficult because it usually requires people to fill out lengthy questionnaires that evaluate various personality traits. A system that can identify personality easily and specifically is therefore needed. Individuals often express their feelings on social media, and Twitter is one of the most popular social networking platforms today. In this research, we use the XGBoost algorithm, a powerful machine learning method, to create a personality detection system that improves upon existing approaches. Our aim is to determine how well XGBoost can recognize Big Five personality traits in Twitter users. The resulting model recognizes all five trait labels, but with markedly different precision, recall, and F1-score values. The highest values were obtained for the Extroversion label, with a precision of 0.92, a recall of 1.00, and an F1-score of 0.96. The lowest were obtained for the Agreeableness label, with a precision of 0.29, a recall of 0.29, and an F1-score of 0.29. This research demonstrates the potential of the XGBoost algorithm as a fast way to detect personality on social media platforms, although accuracy varies considerably across traits. Overall, the results open the door to further work on understanding and evaluating human behavior through social media platforms such as Twitter.
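As a quick consistency check on the reported metrics, the F1-score can be recomputed from precision and recall as their harmonic mean. The sketch below uses only the four values stated in the abstract. The names `f1_score`, `reported`, and `f1_by_label` are illustrative and are not taken from the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract.
reported = {
    "Extroversion": (0.92, 1.00),
    "Agreeableness": (0.29, 0.29),
}

# Round to two decimals to match the precision used in the abstract.
f1_by_label = {
    label: round(f1_score(p, r), 2) for label, (p, r) in reported.items()
}

for label, f1 in f1_by_label.items():
    print(f"{label}: F1 = {f1:.2f}")
```

Both recomputed values (0.96 and 0.29) agree with the F1-scores stated above, so the reported precision, recall, and F1 figures are internally consistent.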
Copyright (c) 2024 Adinda Putri Rosyadi, Warih Maharani, Prati Hutari Gani
This work is licensed under a Creative Commons Attribution 4.0 International License.