An Orthogonal Laplacian Method for Language Recognition

DOI: 10.3724/SP.J.1004.2014.01812, PP. 1812-1818

Keywords: factor analysis, identity vector (i-vector), manifold learning, orthogonal locality preserving projection, language recognition


Abstract:

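This paper proposes an orthogonal Laplacian method for language recognition. After i-vectors are extracted from the speech, orthogonal locality preserving projection (OLPP) is applied as a subspace mapping that projects the total signal space onto a subspace containing language information plus channel information. The projected vectors then undergo channel compensation, and a support vector machine performs the final recognition. Although the i-vector retains as much of the acoustic information in speech as possible, it does not reveal the intrinsic structure among that information. OLPP removes information irrelevant to the acoustics and, on that basis, further uncovers the intrinsic structure of the acoustic features, which effectively improves their discriminability. In experiments on the NIST LRE 2003 test database, the new method reduced the average cost by 28.91% relative to the baseline system.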
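The projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it computes an ordinary locality preserving projection and then orthonormalizes the basis with a QR decomposition, which is a simpler stand-in for the iterative OLPP procedure. The function name `lpp_orthogonalized`, the neighbor count, the heat-kernel width, and the regularization term are all assumptions chosen for illustration, and the input rows stand in for i-vectors that would be extracted elsewhere.

```python
import numpy as np
from scipy.linalg import eigh


def lpp_orthogonalized(X, n_components, n_neighbors=5, reg=1e-6):
    """Locality preserving projection followed by QR orthonormalization.

    X: array of shape (n_samples, n_features), one vector per row
       (hypothetical stand-ins for i-vectors).
    Returns a (n_features, n_components) matrix with orthonormal columns.
    """
    n = X.shape[0]

    # Pairwise squared Euclidean distances between all samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Indices of the k nearest neighbors of each sample (column 0 is the sample itself).
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()

    # Heat-kernel weights; the width t is set to the mean neighbor distance (an assumption).
    t = np.mean(d2[rows, cols]) + 1e-12
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)  # symmetrize the adjacency graph

    # Degree matrix and graph Laplacian.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # Generalized eigenproblem X^T L X a = lambda X^T D X a.
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(X.shape[1])  # small ridge keeps B positive definite
    _, vecs = eigh(A, B)  # eigenvalues are returned in ascending order

    # Keep the directions with the smallest eigenvalues, then orthonormalize them.
    Q, _ = np.linalg.qr(vecs[:, :n_components])
    return Q
```

Applied to a matrix of vectors `X`, the reduced features would be `X @ lpp_orthogonalized(X, n_components)`. In the pipeline the abstract describes, these would then pass through channel compensation and a support vector machine, neither of which is shown here.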

