Deep Learning-Based Recognition of Indonesian Sign Language (BISINDO) Alphabetic Gestures Using Skeletal Feature Extraction and LSTM

Teuku M Arief Afwan; Rahmat Gernowo; Helmie Arif Wibawa

doi:10.52436/1.jutif.2026.7.2.5337

Authors

Teuku M Arief Afwan, Master Program of Information System, Diponegoro University, Semarang, Indonesia
Rahmat Gernowo, Doctoral Program of Information System, Diponegoro University, Semarang, Indonesia
Helmie Arif Wibawa, Department of Computer Science / Informatics, Diponegoro University, Semarang, Indonesia

Keywords:

Deep Learning, Hand gesture, LSTM, Mediapipe, Sign Language

Abstract

Communication is a fundamental aspect of human life, and for the deaf community, sign language serves as the primary medium of interaction. In Indonesia, Indonesian Sign Language (BISINDO) is widely used; however, research on automatic BISINDO recognition remains limited due to the scarcity of representative datasets. This study presents a deep learning-based BISINDO recognition system that integrates the Long Short-Term Memory (LSTM) architecture with the MediaPipe Holistic framework. To address data limitations, a custom dataset of 866 BISINDO alphabetic gesture videos was collected from both expert and non-expert signers to capture stylistic variation. The extracted skeletal landmark features were processed by a three-layer LSTM network followed by dense layers for sequential modeling and classification. Experimental results show that the proposed model achieved a validation accuracy of approximately 93%, outperforming static image-based methods and demonstrating the effectiveness of skeletal features in representing dynamic gestures. The model also showed promising performance in real-time use, although challenges such as misclassification of visually similar gestures and dataset imbalance remain. This study contributes to the underexplored field of BISINDO recognition by providing a baseline system and dataset, and advances computer vision and human–computer interaction within informatics through an inclusive, data-driven framework for Indonesian Sign Language recognition and future AI-assisted accessibility technologies.
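
As a rough illustration of the skeletal feature step, the sketch below shows one way per-frame MediaPipe Holistic landmarks could be flattened into a fixed-length vector and stacked into a fixed-length sequence suitable as LSTM input. The abstract does not state which landmark groups, sequence length, or padding strategy the authors used, so the use of all four groups, the 30-frame length, zero-filling, and the names `PARTS`, `flatten_frame`, and `to_fixed_length` are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Sequence

# Landmark count and values per landmark for each MediaPipe Holistic output.
# Using all four groups is an assumption; the abstract does not say which were kept.
PARTS = {
    "pose": (33, 4),        # x, y, z, visibility
    "face": (468, 3),       # x, y, z
    "left_hand": (21, 3),   # x, y, z
    "right_hand": (21, 3),  # x, y, z
}
FEATURE_DIM = sum(count * dims for count, dims in PARTS.values())


def flatten_frame(detections: Dict[str, Optional[Sequence[Sequence[float]]]]) -> List[float]:
    """Flatten one frame's landmarks into a single vector.

    Groups that are missing or None (for example a hand outside the frame)
    are zero-filled so every frame has the same length.
    """
    vector: List[float] = []
    for name, (count, dims) in PARTS.items():
        points = detections.get(name)
        if points is None:
            vector.extend([0.0] * (count * dims))
            continue
        if len(points) != count or any(len(p) != dims for p in points):
            raise ValueError(f"unexpected landmark shape for {name}")
        for point in points:
            vector.extend(float(v) for v in point)
    return vector


def to_fixed_length(frames: Sequence[List[float]], target_len: int = 30) -> List[List[float]]:
    """Truncate or zero-pad a list of frame vectors to target_len frames.

    target_len=30 is a hypothetical choice, not a value taken from the paper.
    """
    clipped = [list(f) for f in frames[:target_len]]
    padding = [[0.0] * FEATURE_DIM for _ in range(target_len - len(clipped))]
    return clipped + padding
```

A sequence prepared this way has shape (frames, features) per video, which is the layout a stacked LSTM classifier would typically consume.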
