Speech Signal Enhancement with Integrated Weighted Filtering for PSNR Reduction in Multimedia Applications
DOI: https://doi.org/10.69996/jcai.2024011

Keywords: Speech Signal, Kalman Filter, Speech Enhancement, Classification, Multimedia

Abstract
This paper investigates the effectiveness of the Weighted Kalman Integrated Band Rejection (WKBR) method for enhancing speech signals in multimedia applications. Speech enhancement is crucial for improving the quality and intelligibility of audio in environments with varying noise types and levels. The WKBR method is evaluated across ten different noise scenarios, including white noise, babble noise, street noise, airplane cabin noise, and more. Performance metrics such as Peak Signal-to-Noise Ratio (PSNR), Mean Squared Error (MSE), and Short-Time Objective Intelligibility (STOI) are used to quantify the enhancement. The results show significant improvements, with PSNR increasing from an average of 12.8 dB before enhancement to 21.9 dB after enhancement, MSE reducing from an average of 0.0179 to 0.0053, and STOI scores improving from an average of 0.58 to 0.75. These findings highlight the potential of WKBR as a powerful tool for speech signal enhancement, making it a promising solution for real-world multimedia applications where clear and intelligible speech is essential.
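The abstract reports the enhancement gain through PSNR and MSE computed between the clean reference and the processed signal. As a minimal illustration of how these two metrics are evaluated (the exact normalization and peak value used by the paper are not stated, so `peak=1.0` for signals scaled to [-1, 1] is an assumption), a sketch in Python:

```python
import numpy as np

def mse(clean, estimate):
    """Mean squared error between the clean reference and an estimate."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean((clean - estimate) ** 2))

def psnr(clean, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum
    signal amplitude (here 1.0, i.e. signals normalized to [-1, 1])."""
    err = mse(clean, estimate)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

# Toy example: a sine "speech" signal corrupted by additive white noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
clean = 0.5 * np.sin(2 * np.pi * 220 * t)
noisy = clean + rng.normal(0.0, 0.05, clean.shape)

print(psnr(clean, noisy))  # PSNR of the noisy signal, in dB
```

STOI is considerably more involved (it compares short-time temporal envelopes in one-third-octave bands) and is typically computed with a dedicated implementation rather than re-derived by hand.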
License
Copyright (c) 2024 Journal of Computer Allied Intelligence (JCAI)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Fringe Global Scientific Press publishes all papers under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license (https://creativecommons.org/licenses/by-nc/4.0/). Authors are free to reproduce and distribute their work, and may reuse it in whole or in part in compilations or other publications that include their own work. Please see the licensing terms for more information on reusing the work.