Single Channel Speech Enhancement Techniques in Spectral Domain

DOI: 10.5402/2012/919234

Abstract:

This paper presents single-channel speech enhancement techniques in the spectral domain. One of the best-known single-channel speech enhancement techniques is the spectral subtraction method proposed by S. F. Boll in 1979. In this method, an estimated speech spectrum is obtained by simply subtracting a pre-estimated noise spectrum from the observed one. Hence, the spectral subtraction method does not take speech spectral properties into account. It is well known that the spectral subtraction method produces an annoying artificial noise in the extracted speech signal. In contrast, recent successful speech enhancement methods actively exploit the properties of speech and achieve efficient speech enhancement. This paper gives a historical review of several speech estimation techniques and explicitly states the differences between their theoretical backgrounds. Moreover, to evaluate their speech enhancement capabilities, we perform computer simulations. The results show that an adaptive speech enhancement method based on MAP estimation gives the best noise reduction capability among the speech enhancement methods presented in this paper.

1. Introduction

In recent years, speech enhancement has been required in a wide range of applications, including mobile communication and speech recognition systems; a major example is the cell phone, as shown in Figure 1. Many speech enhancement methods have been established over the decades [1–15]. These techniques can be classified into time-domain methods and spectral-domain methods. Most recent major speech enhancement techniques are spectral-domain methods, which are preferred for use in cell phones. In this paper, we focus on spectral-domain speech enhancement techniques that employ a single microphone.

Figure 1: Application of speech enhancement.

The spectral subtraction method [3] is one of the most popular of the numerous noise reduction techniques in the spectral domain. This method achieves noise reduction by simply subtracting a pre-estimated noise spectral amplitude from the observed spectral amplitude, while the spectral phase is left unprocessed. The spectral subtraction method is easy to implement and effectively reduces stationary noise. However, it introduces an artificial noise, called musical noise, which is caused by speech estimation errors. Because the spectral subtraction method does not use speech spectral information, it often produces estimation errors. Ephraim and Malah have proposed the MMSE-STSA (Minimum Mean Square Error-Short-Time Spectral Amplitude)
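
The amplitude subtraction described above can be illustrated on a single frame. The following is only a minimal sketch under stated assumptions, not the implementation used in the paper: the function names (dft, idft, spectral_subtraction), the naive DFT, and the clipping of negative differences to a floor value are all choices made for this illustration. It subtracts a pre-estimated noise amplitude from the observed amplitude in each frequency bin and reuses the observed phase unchanged.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (O(n^2), for clarity)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each time sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


def spectral_subtraction(noisy_frame, noise_amplitude, floor=0.0):
    """Subtract a pre-estimated noise amplitude from each bin of one frame.

    noise_amplitude holds one non-negative value per DFT bin.
    Negative differences are clipped to `floor` (an assumption of this
    sketch). The observed phase is not processed.
    """
    observed = dft(noisy_frame)
    enhanced = []
    for x_k, n_k in zip(observed, noise_amplitude):
        amplitude = max(abs(x_k) - n_k, floor)  # amplitude-domain subtraction
        phase = cmath.phase(x_k)                # observed phase, reused as-is
        enhanced.append(cmath.rect(amplitude, phase))
    return idft(enhanced)


# Toy usage: a 16-sample sinusoid occupying DFT bins 2 and 14.
frame = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
halved = spectral_subtraction(frame, [4.0] * 16)
```

In this toy case the two occupied bins each have amplitude 8, so subtracting 4 from every bin halves the sinusoid, while all other bins are clipped to zero. With real noisy speech the noise estimate rarely matches the noise actually present in a given frame, so the bins that survive the clipping vary from frame to frame; these are the estimation errors that the text associates with musical noise.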

References

[1]  M. Muneyasu and A. Taguchi, Nonlinear Digital Signal Processing, Asakura Publishing, Tokyo, Japan, 1999.
[2]  A. Kawamura, Y. Iiguni, and Y. Itoh, “A noise reduction method based on linear prediction with variable step-size,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E88-A, no. 4, pp. 855–861, 2005.
[3]  S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, no. 2, pp. 113–120, 1979.
[4]  Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 6, pp. 1109–1121, 1984.
[5]  B. Widrow, J. G. R. Glover Jr., J. M. McCool, et al., “Adaptive noise cancelling: principles and applications,” Proceedings of the IEEE, vol. 63, no. 12, pp. 1692–1716, 1975.
[6]  P. J. Wolfe and S. J. Godsill, “Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement,” EURASIP Journal on Applied Signal Processing, vol. 2003, no. 10, pp. 1043–1051, 2003.
[7]  R. J. McAulay and M. L. Malpass, “Speech enhancement using a soft-decision noise suppression filter,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 2, pp. 137–145, 1980.
[8]  B. Chen and P. C. Loizou, “Speech enhancement using a MMSE short time spectral amplitude estimator with Laplacian speech modeling,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), pp. I1097–I1100, March 2005.
[9]  R. Martin, “Speech enhancement based on minimum mean-square error estimation and supergaussian priors,” IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 845–856, 2005.
[10]  S. Gazor and W. Zhang, “Speech enhancement employing Laplacian-Gaussian mixture,” IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 896–904, 2005.
[11]  T. Lotter and P. Vary, “Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model,” EURASIP Journal on Applied Signal Processing, vol. 2005, no. 7, pp. 1110–1126, 2005.
[12]  I. Andrianakis and P. R. White, “Speech spectral amplitude estimators using optimally shaped Gamma and Chi priors,” Speech Communication, vol. 51, no. 1, pp. 1–14, 2009.
[13]  Y. Tsukamoto, A. Kawamura, and Y. Iiguni, “Speech enhancement based on MAP estimation using a variable speech distribution,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E90-A, no. 8, pp. 1587–1593, 2007.
[14]  A. Kawamura, W. Thanhikam, and Y. Iiguni, “A speech spectral estimator using adaptive speech probability density function,” in Proceedings of the EUSIPCO 2010, pp. 1549–1552, August 2010.
[15]  W. Thanhikam, A. Kawamura, and Y. Iiguni, “Speech enhancement using speech model parameters refined by two-step technique,” in Proceedings of the 2nd APSIPA Annual Summit and Conference, p. 11, December 2010.
[16]  W. Thanhikam, A. Kawamura, and Y. Iiguni, “Speech enhancement based on real-speech PDF in various narrow SNR intervals,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E95-A, no. 3, pp. 623–630, 2012.
[17]  S. Furui, Digital Speech Processing, Tokai University Press, Tokyo, Japan, 1985.
[18]  S. L. Miller and D. G. Childers, Probability and Random Processes, Elsevier/Academic Press, 2004.
[19]  M. Kato, A. Sugiyama, and M. Serizawa, “Noise suppression with high speech quality based on weighted noise estimation and MMSE STSA,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E85-A, no. 7, pp. 1710–1718, 2002.
