IMPLEMENTATION OF DEEP LEARNING FOR DETECTING PHISHING ATTACKS ON WEBSITES WITH COMBINATION OF CNN AND LSTM
Abstract
Phishing attacks are a significant cyber threat to internet users, particularly through websites. Perpetrators carry out these attacks to acquire victims' data by impersonating legitimate websites. To address this threat, a deep learning solution is proposed that combines a convolutional neural network (CNN) with long short-term memory (LSTM). The research methodology comprised collecting phishing and legitimate website links, pre-processing them through tokenization, padding, and labeling, and splitting the data into training and testing sets. The models were then trained, and grid search was employed to identify the optimal hyperparameters for each algorithm. Performance was evaluated using accuracy, precision, recall, and F1-score. The combined algorithm achieved 95.63% accuracy, 94.60% precision, 96.81% recall, and 95.78% F1-score. This paper concludes that the proposed algorithm is effective in detecting phishing attacks on websites.
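The pre-processing and evaluation steps named in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: character-level tokenization, the reserved ids 0 (padding) and 1 (unknown), the 80/20 split ratio, the example URLs, and all function names are assumptions made for this sketch.

```python
import random


def build_vocab(urls):
    """Map each character seen in the URLs to an integer id.

    Assumption: id 0 is reserved for padding and id 1 for unknown characters.
    """
    chars = sorted({ch for url in urls for ch in url})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(url, vocab, max_len):
    """Tokenize a URL per character, truncate to max_len, then right-pad with 0."""
    ids = [vocab.get(ch, 1) for ch in url][:max_len]
    return ids + [0] * (max_len - len(ids))


def train_test_split(samples, test_ratio=0.2, seed=0):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score with label 1 = phishing."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labeled links: 1 = phishing, 0 = legitimate.
labeled_urls = [
    ("http://example.com/login", 0),
    ("http://examp1e-secure.test/verify", 1),
    ("https://example.org/docs", 0),
    ("http://account-update.example.test/signin", 1),
    ("https://example.net/", 0),
]

vocab = build_vocab(url for url, _ in labeled_urls)
encoded = [(encode(url, vocab, max_len=32), label) for url, label in labeled_urls]
train_set, test_set = train_test_split(encoded, test_ratio=0.2)
```

The fixed-length integer sequences in `train_set` are the kind of input an embedding layer followed by convolutional and LSTM layers would consume. After a trained model predicts labels for the sequences in `test_set`, `binary_metrics` yields the four scores the abstract reports.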
Copyright (c) 2024 Ahmad Raihan, Mohammad Fadhli, Lindawati Lindawati
This work is licensed under a Creative Commons Attribution 4.0 International License.