In this paper, a new method for making the voiced/unvoiced (v/uv) decision has been developed. It uses a multi-feature v/uv classification algorithm that analyzes three features of short-time segments of the speech signal: the cepstral peak, the zero-crossing rate, and the autocorrelation function (ACF) peak. Clustering methods are then applied to these features to separate the two classes. The classifier achieved excellent results in identifying voiced and unvoiced segments of speech.
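As an illustration of the approach summarized above, the following Python sketch computes the three features for each frame and groups the frames into two clusters. It is a minimal sketch under stated assumptions, not the implementation used in this paper: the clustering step is assumed to be a plain 2-means procedure, the pitch search range of 80 to 400 Hz is an assumed value, and the function names `frame_features`, `two_means`, and `classify_frames` are hypothetical.

```python
import numpy as np


def frame_features(frame, fs, f0_min=80.0, f0_max=400.0):
    """Return [cepstral peak, zero-crossing rate, normalized ACF peak] for one frame.

    The pitch range (f0_min, f0_max) is an assumed value, not taken from the paper.
    """
    frame = frame - np.mean(frame)
    n = len(frame)
    lo, hi = int(fs / f0_max), int(fs / f0_min)  # pitch-lag search range in samples

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frame)
    zcr = np.mean(signs[1:] != signs[:-1])

    # Autocorrelation for lags 0..n-1, peak in the pitch range normalized by lag 0.
    acf = np.correlate(frame, frame, mode="full")[n - 1:]
    acf_peak = np.max(acf[lo:hi + 1]) / (acf[0] + 1e-12)

    # Real cepstrum: inverse FFT of the log magnitude spectrum of the windowed frame.
    log_mag = np.log(np.abs(np.fft.rfft(frame * np.hamming(n))) + 1e-12)
    cepstrum = np.fft.irfft(log_mag, n)
    cep_peak = np.max(cepstrum[lo:hi + 1])

    return np.array([cep_peak, zcr, acf_peak])


def two_means(features, n_iter=20):
    """Cluster standardized feature vectors into two groups (assumed 2-means).

    Returns a boolean mask that is True for frames in the cluster whose
    centroid has the higher ACF peak, which is taken here to be the voiced one.
    """
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-12)

    # Initialize the centroids at the frames with the lowest and highest ACF peak.
    order = np.argsort(z[:, 2])
    centroids = np.stack([z[order[0]], z[order[-1]]])

    for _ in range(n_iter):
        dist = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = z[labels == k].mean(axis=0)

    voiced_cluster = int(np.argmax(centroids[:, 2]))
    return labels == voiced_cluster


def classify_frames(frames, fs):
    """Return a boolean voiced mask for a 2-D array of frames (one frame per row)."""
    features = np.array([frame_features(f, fs) for f in frames])
    return two_means(features)
```

Given a 2-D array `frames` with one short-time segment per row and the sampling rate `fs`, `classify_frames(frames, fs)` returns `True` for each frame assigned to the voiced cluster. Standardizing the features before clustering keeps the zero-crossing rate, which lies between 0 and 1, from being dominated by features on a different scale.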