Comparative Analysis of Face Mask Detection using Lightweight CNN and Bag of Visual Word-based Classifier for Real-Time Surveillance
DOI:
https://doi.org/10.52436/1.jutif.2026.7.1.4461
Keywords:
Bag of Visual Word, Convolutional Neural Network, Face Mask Detection, Multi-layer Perceptron
Abstract
Face mask detection has become increasingly important across various sectors, including healthcare, food processing industries, and public safety, to ensure adherence to health and hygiene protocols and minimize the risks of contamination. Manual supervision of mask usage is often inefficient, labor-intensive, and prone to inconsistency. To address this challenge, this study proposes an automated face mask detection system utilizing computer vision technology, designed for real-time monitoring on resource-limited edge devices, such as the Raspberry Pi 4 with a Google Coral USB Accelerator.
The system integrates Multi-task Cascaded Convolutional Neural Networks (MTCNN) for face detection and a modified lightweight Convolutional Neural Network (CNN) for binary mask classification. Deployed via a web-based platform, it captures images of non-compliant individuals and triggers alerts. To evaluate model performance, the modified CNN is compared with the Bag of Visual Words (BoVW) method using SVM and MLP classifiers on two datasets: the 12k-Face Mask Dataset (Kaggle) and a newly proposed dataset. The CNN model demonstrated higher classification performance than both BoVW-SVM and BoVW-MLP, with AUC improvements of 49% and 43% on the proposed and 12k-Face Mask datasets, respectively.
This study contributes to computer vision-based public health monitoring by offering a robust, scalable, real-time face mask detection system. The findings highlight the practical advantages of deep learning approaches over traditional feature extraction techniques and support the development of intelligent, automated surveillance and policy enforcement systems in high-risk environments. They may also facilitate future advancements in AI-driven public safety solutions.
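The BoVW baseline named in the abstract turns a variable number of local image descriptors into one fixed-length vector that an SVM or MLP can classify. The following is a minimal sketch of that encoding step only, not the authors' implementation. The descriptor type, the vocabulary size, the distance measure, and the names `nearest_word` and `bovw_histogram` are assumptions chosen for illustration, and the toy 2-D descriptors stand in for real local features.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor.

    Euclidean distance is an assumption; binary descriptors would
    normally be compared with Hamming distance instead.
    """
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))


def bovw_histogram(descriptors, codebook):
    """Encode one image as an L1-normalized histogram of visual words."""
    if not descriptors:
        # An image with no detected keypoints maps to an all-zero vector.
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(codebook))]


# Hypothetical 3-word vocabulary (in practice learned by clustering
# descriptors from the training images, e.g. with k-means).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]

# Hypothetical descriptors extracted from a single face crop.
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (3.9, 4.2)]

histogram = bovw_histogram(descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

The resulting histogram has the same length for every image regardless of how many keypoints were found, which is what allows a conventional classifier to consume it. A CNN, by contrast, learns its features directly from pixels, so it has no separate codebook stage.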
License
Copyright (c) 2026 Ika Candradewi, Bakhtiar Aldino Ardi S, Agus Harjoko, Andi Dharmawan

This work is licensed under a Creative Commons Attribution 4.0 International License.





