Acta Automatica Sinica (自动化学报), 2014

Linear Discriminant Analysis of Speech Recognition Features Based on the MCE Criterion (基于MCE准则的语音识别特征线性判别分析)

DOI: 10.3724/SP.J.1004.2014.01208, PP. 1208-1215

Chen Bin, Zhang Lian-Hai, Niu Tong, Qu Dan, Li Bi-Cheng (陈斌, 张连海, 牛铜, 屈丹, 李弼程)

Keywords: linear discriminant analysis, speech recognition, kernel density estimation, feature transformation

Abstract:

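This paper proposes a linear discriminant analysis (LDA) method based on the minimum classification error (MCE) criterion and applies it to feature transformation in continuous speech recognition. The method uses nonparametric kernel density estimation to estimate the probability distribution of the data. Given the estimated distribution, the discriminant transformation matrix is obtained under the MCE criterion with a gradient-descent-based line search algorithm. The transformation matrix is then used to transform and reduce the dimensionality of supervectors formed by concatenating the Mel filter bank outputs of adjacent frames, yielding time-frequency features. Experimental results show that, compared with conventional MFCC features, the time-frequency features extracted by the proposed discriminant analysis improve recognition accuracy by 1.41%. Compared with the HLDA (heteroscedastic LDA) and approximate pairwise empirical accuracy criterion (aPEAC) discriminant analysis methods, recognition accuracy improves by 1.14% and 0.83%, respectively.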
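
The pipeline described in the abstract (frame splicing, linear projection, kernel density estimation, MCE objective) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed bandwidth `h`, the edge padding, and the sigmoid-smoothed misclassification measure are all assumptions chosen for illustration, and the gradient-descent line search that would optimize `W` is omitted.

```python
import math

def splice_frames(frames, context):
    """Concatenate each frame with `context` neighbours on each side.
    Edge frames are repeated as padding (an assumption of this sketch)."""
    n = len(frames)
    spliced = []
    for t in range(n):
        vec = []
        for k in range(t - context, t + context + 1):
            vec.extend(frames[min(max(k, 0), n - 1)])
        spliced.append(vec)
    return spliced

def project(W, x):
    """Apply the transformation matrix W (list of rows) to the supervector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def kde(point, samples, h):
    """Gaussian kernel density estimate at `point` with a fixed bandwidth h."""
    d = len(point)
    norm = (math.sqrt(2.0 * math.pi) * h) ** d
    total = 0.0
    for s in samples:
        sq = sum((p - q) ** 2 for p, q in zip(point, s))
        total += math.exp(-0.5 * sq / h ** 2)
    return total / (len(samples) * norm)

def mce_loss(W, xs, labels, h=1.0, slope=1.0):
    """Sigmoid-smoothed classification error in the projected space.
    Needs at least two classes. Class densities come from the KDE above;
    the misclassification measure is log p(best rival) - log p(true class)."""
    ys = [project(W, x) for x in xs]
    classes = sorted(set(labels))
    by_class = {c: [y for y, l in zip(ys, labels) if l == c] for c in classes}
    eps = 1e-300  # keeps log() finite when a density underflows to zero
    total = 0.0
    for y, l in zip(ys, labels):
        g = {c: math.log(kde(y, by_class[c], h) + eps) for c in classes}
        rival = max(g[c] for c in classes if c != l)
        d = max(min(rival - g[l], 50.0), -50.0)  # clamp to avoid overflow
        total += 1.0 / (1.0 + math.exp(-slope * d))
    return total / len(ys)
```

In this sketch, a projection that keeps the classes separated gives a loss close to 0, while a projection that collapses all samples onto one point gives exactly 0.5 for two classes. An optimizer would adjust `W` to push the loss down.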

