Speech Signal Enhancement with Integrated Weighted Filtering for PSNR Reduction in Multimedia Applications

Authors

  • T. Veeramakali, Associate Professor, Department of Data Science and Business Systems, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamil Nadu, 603203, India
  • Syed Raffi Ahamed J, Assistant Professor, Department of Computer Applications, Karpaga Vinayaga College of Engineering and Technology, Maduranthagam Taluk, Tamil Nadu, 603308, India
  • Bagiyalakshmi N, Assistant Professor, Department of Computer Science and Engineering, Rajalakshmi Engineering College, Thandalam, Mevalurkuppam, Tamil Nadu, 602105, India

DOI:

https://doi.org/10.69996/jcai.2024011

Keywords:

Speech Signal, Kalman Filter, Speech Enhancement, Classification, Multimedia

Abstract

This paper investigates the effectiveness of the Weighted Kalman Integrated Band Rejection (WKBR) method for enhancing speech signals in multimedia applications. Speech enhancement is crucial for improving the quality and intelligibility of audio in environments with varying noise types and levels. The WKBR method is evaluated across ten noise scenarios, including white noise, babble noise, street noise, and airplane cabin noise. Three performance metrics quantify the enhancement: Peak Signal-to-Noise Ratio (PSNR), Mean Squared Error (MSE), and Short-Time Objective Intelligibility (STOI). On average, PSNR rises from 12.8 dB before enhancement to 21.9 dB after (a gain of 9.1 dB), MSE falls from 0.0179 to 0.0053 (a reduction of about 70%), and STOI improves from 0.58 to 0.75. These findings indicate that WKBR is a promising approach for real-world multimedia applications where clear and intelligible speech is essential.
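For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how MSE and PSNR are commonly computed for a pair of equal-length signals. It is not the authors' evaluation code: the function names `mse` and `psnr_db`, the assumption that samples are normalized to a peak amplitude of 1.0, and the synthetic 440 Hz tone standing in for speech are all illustrative choices. STOI is omitted because it requires a dedicated implementation.

```python
import math
import random


def mse(clean, estimate):
    """Mean squared error between two equal-length sample sequences."""
    if len(clean) != len(estimate):
        raise ValueError("signals must have the same length")
    return sum((c - e) ** 2 for c, e in zip(clean, estimate)) / len(clean)


def psnr_db(clean, estimate, peak=1.0):
    """PSNR in dB; `peak` is the assumed maximum amplitude (hypothetical default 1.0)."""
    err = mse(clean, estimate)
    if err == 0:
        return math.inf  # identical signals: no error power
    return 10.0 * math.log10(peak ** 2 / err)


# Synthetic stand-in for a speech recording: one second of a 440 Hz tone at 8 kHz.
rng = random.Random(0)
fs = 8000
clean = [0.5 * math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]

# Add white Gaussian noise, then build a second version with the noise halved.
noisy = [s + rng.gauss(0.0, 0.1) for s in clean]
less_noisy = [c + 0.5 * (x - c) for c, x in zip(clean, noisy)]

psnr_noisy = psnr_db(clean, noisy)
psnr_less_noisy = psnr_db(clean, less_noisy)

print(f"MSE  noisy: {mse(clean, noisy):.4f}")
print(f"PSNR noisy: {psnr_noisy:.2f} dB")
print(f"PSNR with half the noise amplitude: {psnr_less_noisy:.2f} dB")
```

Halving the noise amplitude quarters the MSE, so the PSNR in this sketch rises by 20·log10(2), roughly 6 dB. This illustrates why a lower MSE and a higher PSNR move together when the peak value is held fixed.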

Published

2024-06-30

Section

Research Article

How to Cite

T. Veeramakali, Syed Raffi Ahamed J, & Bagiyalakshmi N. (2024). Speech Signal Enhancement with Integrated Weighted Filtering for PSNR Reduction in Multimedia Applications. Journal of Computer Allied Intelligence (JCAI, ISSN: 2584-2676), 2(3), 1-14. https://doi.org/10.69996/jcai.2024011