
Authors:

  • Shaik Sajiha, Department of Electronics & Communication Engineering, Velagapudi Ramakrishna Siddhartha Engineering College, Kanuru, Vijayawada, 520007, Andhra Pradesh, India
  • Kodali Radha, Department of Electronics & Communication Engineering, Velagapudi Ramakrishna Siddhartha Engineering College, Kanuru, Vijayawada, 520007, Andhra Pradesh, India
  • Dhulipalla Venkata Rao, Department of Electronics & Communication Engineering, Velagapudi Ramakrishna Siddhartha Engineering College, Kanuru, Vijayawada, 520007, Andhra Pradesh, India
  • Nammi Sneha, Department of Electronics & Communication Engineering, Velagapudi Ramakrishna Siddhartha Engineering College, Kanuru, Vijayawada, 520007, Andhra Pradesh, India
  • Suryanarayana Gunnam, Department of Electronics & Communication Engineering, Velagapudi Ramakrishna Siddhartha Engineering College, Kanuru, Vijayawada, 520007, Andhra Pradesh, India
  • Durga Prasad Bavirisetti, Department of Computer Science, Norwegian University of Science and Technology, 7034, Trondheim, Norway

Journal: EURASIP Journal on Audio, Speech, and Music Processing

Introduction

Dysarthria is a motor speech disorder caused by neurological damage, resulting in articulation difficulties that can severely impact an individual's ability to communicate. Early detection and accurate assessment of the severity of dysarthria are crucial for effective intervention and therapy planning. This study introduces an innovative method for automatic dysarthria detection (ADD) and severity level assessment (ADSLA) using a continuous wavelet transform (CWT)-layered convolutional neural network (CNN) model.

Objectives

  • Automate Dysarthria Detection: Develop a model to automatically detect dysarthria in speech signals.
  • Assess Severity Levels: Accurately classify the severity of dysarthria to facilitate appropriate therapeutic measures.
  • Optimize Signal Processing: Explore the effectiveness of different wavelets in signal representation and feature extraction.

Methodology

The proposed model leverages a CWT-layered CNN architecture to process speech signals and detect dysarthria. The continuous wavelet transform is applied to convert raw speech signals into time-frequency representations, capturing both spectral and temporal information.
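The paper's preprocessing code is not reproduced here; as a minimal sketch under stated assumptions, the snippet below computes a CWT scalogram of a toy signal with a complex Morlet wavelet implemented directly in NumPy (the Amor, Morse, and Bump wavelets named below are the analogous choices in MATLAB's `cwt`; the scale range, `w0`, and sampling rate are illustrative values, not the study's settings):

```python
import numpy as np

def morlet_cwt(signal, scales, fs, w0=6.0):
    """Continuous wavelet transform with a complex Morlet wavelet.

    Returns a (len(scales), len(signal)) array of complex coefficients;
    its magnitude is the time-frequency "scalogram" image fed to the CNN.
    """
    n = len(signal)
    out = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Wavelet support: +/- 4 widths of the Gaussian envelope at this scale
        m = int(np.ceil(4 * s))
        u = np.arange(-m, m + 1) / s  # dimensionless time axis in samples/scale
        # Complex Morlet: Gaussian-windowed complex exponential, scale-normalised
        psi = (np.pi ** -0.25) * np.exp(1j * w0 * u) * np.exp(-0.5 * u ** 2)
        psi /= np.sqrt(s)
        out[i] = np.convolve(signal, np.conj(psi)[::-1], mode="same")
    return out

# Toy "speech" signal: a 100 Hz tone switching to 400 Hz halfway through
fs = 8000
t = np.arange(0, 1.0, 1 / fs)
x = np.where(t < 0.5, np.sin(2 * np.pi * 100 * t), np.sin(2 * np.pi * 400 * t))

scales = np.geomspace(8, 256, num=32)          # log-spaced scales
scalogram = np.abs(morlet_cwt(x, scales, fs))  # time-frequency image for the CNN
print(scalogram.shape)                          # (32, 8000)
```

Because scale is inversely proportional to frequency, the scalogram's energy shifts from a large-scale row to a small-scale row at the halfway point, which is exactly the joint spectral-temporal structure the CNN consumes.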

Wavelets Used:

  • Amor (Analytic Morlet) Wavelet: Known for its ability to represent signals accurately and suppress noise, making it ideal for nuanced signal analysis.
  • Morse Wavelet: Provides good time-frequency localization, useful for identifying specific speech features.
  • Bump Wavelet: Effective for detecting transient features in signals.

The CNN model is layered on top of the CWT outputs, learning to classify dysarthria presence and severity directly from the transformed signals. The study utilized two benchmark datasets, TORGO and UA-Speech, which contain speech samples from individuals with varying levels of dysarthria and from healthy speakers.
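The paper's exact CNN architecture is not reproduced here. As an illustration only, the following NumPy-only sketch shows the shape of such a pipeline with untrained random weights: one convolutional layer over the scalogram, ReLU, global average pooling, and a softmax head over four hypothetical severity classes (the layer sizes and class count are assumptions, not the study's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, kernels):
    """Valid 2-D convolution of an (H, W) map with (K, kh, kw) kernels."""
    k, kh, kw = kernels.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((k, h, w))
    for ki in range(k):
        for i in range(h):
            for j in range(w):
                out[ki, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[ki])
    return out

def forward(scalogram, kernels, w_fc):
    feat = np.maximum(conv2d(scalogram, kernels), 0.0)  # conv + ReLU
    pooled = feat.mean(axis=(1, 2))                     # global average pooling
    logits = pooled @ w_fc                              # dense classifier head
    e = np.exp(logits - logits.max())
    return e / e.sum()                                  # softmax over classes

# Untrained weights; 4 severity classes is a plausible label set, not the paper's
kernels = rng.standard_normal((8, 3, 3)) * 0.1
w_fc = rng.standard_normal((8, 4)) * 0.1

scalogram = np.abs(rng.standard_normal((32, 64)))  # stand-in CWT magnitude image
probs = forward(scalogram, kernels, w_fc)
print(probs.shape, round(probs.sum(), 6))          # (4,) 1.0
```

In practice the convolution would be run on a framework such as PyTorch or TensorFlow and the weights learned from the TORGO and UA-Speech scalograms; the sketch only shows how a 2-D time-frequency image flows through to class probabilities.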

Results

The Amor wavelet emerged as the most effective for this application, providing the highest accuracy in both detection and severity assessment tasks. The CWT-layered CNN model demonstrated the following:

  • High Accuracy: The model identified dysarthria and classified its severity more accurately than traditional feature extraction methods.
  • Robustness to Noise: The use of wavelets, particularly the Amor wavelet, effectively suppressed noise, preserving essential speech features for analysis.
  • Comprehensive Analysis: The model's architecture allowed for the analysis of both spectral and temporal aspects of the speech signal, crucial for accurate classification.

Discussion

This study highlights the potential of combining wavelet transforms with deep learning models for speech disorder analysis. The use of CWT provided a rich representation of speech signals, while the CNN architecture effectively learned to distinguish between different severity levels of dysarthria. The findings suggest that selecting the appropriate wavelet is critical for optimizing model performance in such applications.

Conclusion

The research presents a novel approach to automatic dysarthria detection and severity assessment using a CWT-layered CNN model. The findings demonstrate the importance of wavelet selection in signal processing tasks and the effectiveness of deep learning in medical diagnostics. The Amor wavelet, in particular, was found to be highly suitable for this application, enabling accurate and efficient classification of dysarthria severity.

Implications and Future Work

The successful implementation of this model has significant implications for clinical practice, offering a non-invasive, automated tool for early dysarthria diagnosis and severity assessment. Future work should focus on expanding the dataset to include a broader range of speech samples, exploring other wavelet types, and refining the CNN architecture for even better performance. The integration of this model into clinical settings could streamline diagnostic processes and improve patient outcomes.
