Acta Automatica Sinica (自动化学报), 2014

An Orthogonal Laplacian Language Recognition Method

DOI: 10.3724/SP.J.1004.2014.01812, pp. 1812-1818

Yang Xukui, Qu Dan, Zhang Wenlin (杨绪魁, 屈丹, 张文林)

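Keywords: factor analysis, identity vector (i-vector), manifold learning, orthogonal locality preserving projection (OLPP), language recognition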

Abstract:

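This paper proposes an orthogonal Laplacian language recognition method. After the i-vector of an utterance is extracted, orthogonal locality preserving projection (OLPP) is used for subspace mapping, projecting the total signal space onto a subspace that contains language information plus channel information. Channel compensation is then applied to the projected vectors, and a support vector machine (SVM) performs the final recognition. Although the i-vector retains as much of the acoustic information in speech as possible, it does not reveal the intrinsic structure among that information. OLPP removes acoustically irrelevant information and, beyond that, uncovers the intrinsic structure of the acoustic features, which effectively improves their discriminability. In experiments on the NIST LRE 2003 test database, the new method reduced the average cost by 28.91% relative to the baseline system.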
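
The OLPP mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the neighborhood graph is built, so the sketch assumes a supervised graph that links vectors sharing a language label, and it uses random vectors as stand-ins for real i-vectors. Each basis vector minimizes a^T X L X^T a / a^T X D X^T a subject to being orthogonal to the previous ones. The constraint is enforced here by solving in the orthogonal complement of the earlier vectors, which targets the same constrained problem as OLPP but is not the closed-form recursion from the literature. The function name `olpp`, the parameter `reg`, and the unit-length scaling are choices made for this sketch. Channel compensation and SVM scoring are left out.

```python
import numpy as np

def olpp(X, labels, n_components, reg=1e-6):
    """Sketch of orthogonal locality preserving projection.

    X: (n_samples, dim) array of vectors (stand-ins for i-vectors).
    labels: (n_samples,) class labels used to build the graph (assumption).
    Returns (A, mean): A is (dim, n_components) with orthonormal columns.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mean = X.mean(axis=0)
    Xc = X - mean
    d = Xc.shape[1]

    # Assumed supervised graph: weight 1 between samples of the same class.
    W = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                                  # graph Laplacian

    S_L = Xc.T @ L @ Xc                        # locality (to be minimized)
    S_D = Xc.T @ D @ Xc + reg * np.eye(d)      # scale constraint, regularized

    A = np.zeros((d, 0))
    for k in range(n_components):
        if k == 0:
            Q = np.eye(d)
        else:
            # Orthonormal basis of the complement of the vectors found so far.
            Q_full, _ = np.linalg.qr(A, mode="complete")
            Q = Q_full[:, k:]
        # Generalized symmetric eigenproblem in the reduced space,
        # turned into a standard one through a Cholesky factor C C^T.
        C = np.linalg.cholesky(Q.T @ S_D @ Q)
        T = np.linalg.solve(C, Q.T @ S_L @ Q)
        M = np.linalg.solve(C, T.T).T          # C^{-1} S_L C^{-T}
        M = 0.5 * (M + M.T)                    # remove round-off asymmetry
        _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
        z = np.linalg.solve(C.T, vecs[:, 0])   # smallest eigenvalue
        a = Q @ z
        a /= np.linalg.norm(a)                 # unit length (sketch choice)
        A = np.column_stack([A, a])
    return A, mean

# Synthetic demo: 3 hypothetical languages, 40 vectors each, 20 dimensions.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 40)
X = 3.0 * rng.normal(size=(3, 20))[labels] + rng.normal(size=(120, 20))
A, mean = olpp(X, labels, n_components=2)
Y = (X - mean) @ A                             # projected vectors
print(Y.shape, np.allclose(A.T @ A, np.eye(2)))
```

In a full system along the lines of the abstract, the projected vectors `Y` would next go through channel compensation and then into an SVM classifier; both stages depend on details the abstract does not give.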

