Stacking-Based Support Vector Machine and Multilayer Perceptron for Dysarthria Detection Using MFCC Features
DOI:
https://doi.org/10.52436/1.jutif.2025.6.4.5199
Keywords:
Dysarthria, Mel-Frequency Cepstral Coefficients, Multilayer Perceptron, Stacking, Support Vector Machine, Voice Classification
Abstract
The manual diagnosis of dysarthria is often time-consuming and requires the expertise of trained specialists, which can delay early intervention and treatment. This study aims to develop an automated detection system to improve diagnostic accuracy and efficiency. Mel-Frequency Cepstral Coefficients (MFCC) are used as the primary features, and three classification models are evaluated: Support Vector Machine (SVM), Multilayer Perceptron (MLP), and a stacking ensemble that combines both. The evaluation is conducted on a dataset of 240 audio samples. Experimental results show that the stacking ensemble achieves the highest performance, with an accuracy of 97.92%, surpassing SVM (95.83%) and MLP (93.75%). These findings highlight the significant potential of voice-based classification to accelerate dysarthria diagnosis, thus supporting clinical screening and speech therapy applications.
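To make the stacking idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes scikit-learn and NumPy, and it replaces real MFCC features with synthetic 13-dimensional vectors for 240 samples so that it runs without audio files. The number of coefficients, the 80/20 split, the RBF kernel, the hidden-layer size, and the logistic-regression meta-learner are all illustrative assumptions; the abstract does not specify any of them.

```python
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-ins for per-recording MFCC vectors
# (e.g. 13 coefficients averaged over time). Real features would be
# extracted from the audio; these values are random and carry no
# clinical meaning.
rng = np.random.default_rng(0)
n_per_class, n_mfcc = 120, 13  # 240 samples in total, as in the abstract
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_class, n_mfcc)),  # class 0 (control)
    rng.normal(1.0, 1.0, size=(n_per_class, n_mfcc)),  # class 1 (dysarthric)
])
y = np.array([0] * n_per_class + [1] * n_per_class)

# ASSUMPTION: a stratified 80/20 hold-out split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Base learners: each one standardizes its inputs before fitting.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
)

# Stacking: out-of-fold outputs of the base learners (cv=5) become the
# input features of the meta-learner, here a logistic regression.
stack = StackingClassifier(
    estimators=[("svm", svm), ("mlp", mlp)],
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

predictions = stack.predict(X_test)
accuracy = stack.score(X_test, y_test)
print(f"Stacking accuracy on synthetic data: {accuracy:.4f}")
```

The meta-learner is trained on cross-validated outputs of the SVM and MLP rather than on their training-set outputs, which keeps it from simply rewarding whichever base model overfits more. Accuracy on this synthetic data says nothing about the figures reported in the abstract.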
License
Copyright (c) 2025 Ardi Pujiyanta, Fiftin Noviyanto, Taufiq Ismail

This work is licensed under a Creative Commons Attribution 4.0 International License.