Speech endpoint detection is an important task in speech signal processing. Although various methods exist, endpoints cannot be detected accurately at low SNR (signal-to-noise ratio). This paper proposes an endpoint detection algorithm that combines two methods: improved spectral subtraction based on multitaper spectral estimation, and Bark subband variance in the frequency domain. First, the noisy speech signal is processed by the improved spectral subtraction based on multitaper spectral estimation, which reduces the noise. Then endpoints are detected using the Bark subband variance method in the frequency domain. Compared with common endpoint detection algorithms, the new method improves endpoint detection accuracy at low SNR.
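The second stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the multitaper spectral subtraction stage is omitted, the band edges are the standard Zwicker critical-band values, and the function names, frame length, hop size, noise-frame count, and threshold ratio are all hypothetical choices made for illustration. The sketch also assumes the first few frames contain only noise.

```python
import cmath
import math

# Standard Zwicker critical-band (Bark) edges in Hz.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300,
                 6400, 7700, 9500, 12000, 15500]


def magnitude_spectrum(frame):
    """One-sided DFT magnitude of a Hamming-windowed frame (naive O(N^2) DFT)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def bark_subband_variance(frame, sample_rate):
    """Variance of spectral energy across the Bark bands below Nyquist."""
    mags = magnitude_spectrum(frame)
    n = len(frame)
    band_energy = []
    for lo, hi in zip(BARK_EDGES_HZ[:-1], BARK_EDGES_HZ[1:]):
        if lo >= sample_rate / 2:
            break
        # Collect the DFT bins whose centre frequency falls in [lo, hi).
        bins = [k for k in range(len(mags)) if lo <= k * sample_rate / n < hi]
        if bins:
            band_energy.append(sum(mags[k] ** 2 for k in bins))
    mean = sum(band_energy) / len(band_energy)
    return sum((e - mean) ** 2 for e in band_energy) / len(band_energy)


def detect_speech_frames(signal, sample_rate, frame_len=200, hop=80,
                         noise_frames=5, ratio=10.0):
    """Return one boolean per frame: True where the variance exceeds
    `ratio` times a noise floor estimated from the leading frames
    (assumed, for this sketch, to contain noise only)."""
    variances = [bark_subband_variance(signal[s:s + frame_len], sample_rate)
                 for s in range(0, len(signal) - frame_len + 1, hop)]
    if not variances:
        return []
    head = variances[:noise_frames]
    noise_floor = sum(head) / len(head)
    return [v > ratio * noise_floor for v in variances]
```

Calling `detect_speech_frames(samples, 8000)` on a list of samples returns one flag per frame; the first and last flagged frames give rough start and end points. A single fixed threshold is the simplest possible decision rule and was chosen only to keep the sketch short.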
Cite this paper
Wei, J. and Sun, X. (2017). Research on Speech Endpoint Detection Algorithm with Low SNR. Open Access Library Journal, 4, e3487. doi: http://dx.doi.org/10.4236/oalib.1103487.