Acta Automatica Sinica, 2014

A Study of Kernel-Function-Based IVEC-SVM Speaker Recognition Systems

DOI: 10.3724/SP.J.1004.2014.00780, PP. 780-784

Li Zhi-Yi, Zhang Wei-Qiang, He Liang, Liu Jia

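Keywords: identity vector followed by cosine distance scoring (IVEC-CDS), identity vector followed by support vector machine (IVEC-SVM), Spline kernel, speaker recognition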

Abstract:

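In speaker recognition research, speaker modeling based on the identity vector (IVEC) extracts speaker information effectively and is currently a state-of-the-art modeling approach. This paper studies the speaker recognition system in which the identity vector is followed by a support vector machine (IVEC-SVM), compares the recognition performance of this system under ten different kernel functions, and compares it with the identity vector followed by cosine distance scoring (IVEC-CDS) system from the literature. Experimental results on the telephone-telephone core evaluation data of the 2010 speaker recognition evaluation organized by the U.S. National Institute of Standards and Technology (NIST) show that the kernel-based IVEC-SVM system clearly outperforms the IVEC-CDS system. In addition, the IVEC-SVM system with the Spline kernel achieves the best recognition performance: compared with the IVEC-CDS system, its equal error rate (EER) is reduced by 10% and 3% before and after score normalization, respectively.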
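
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the spline kernel uses the form commonly given in the kernel-methods literature (usually stated for non-negative inputs), and the abstract does not say which form or which i-vector preprocessing the paper used. The EER routine is a simple threshold sweep rather than an interpolated estimate.

```python
import numpy as np


def cosine_score(w_target, w_test):
    """Cosine distance scoring (CDS) between two i-vectors."""
    w_target = np.asarray(w_target, dtype=float)
    w_test = np.asarray(w_test, dtype=float)
    denom = np.linalg.norm(w_target) * np.linalg.norm(w_test)
    return float(np.dot(w_target, w_test) / denom)


def spline_kernel(x, y):
    """Spline kernel in its commonly cited product form (an assumption here):
    prod_i [1 + x_i*y_i + x_i*y_i*m_i - (x_i + y_i)/2 * m_i**2 + m_i**3 / 3],
    where m_i = min(x_i, y_i).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = np.minimum(x, y)
    terms = 1.0 + x * y + x * y * m - 0.5 * (x + y) * m**2 + m**3 / 3.0
    return float(np.prod(terms))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: sweep thresholds over all observed scores and return
    the mean of the false-accept and false-reject rates where they are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    best_gap, eer = np.inf, 1.0
    for t in np.sort(np.concatenate([target, nontarget])):
        frr = np.mean(target < t)       # target trials rejected
        far = np.mean(nontarget >= t)   # non-target trials accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)
```

In an IVEC-SVM setup, a kernel such as `spline_kernel` would be evaluated over pairs of i-vectors to build a Gram matrix for an SVM trainer that accepts precomputed kernels, whereas IVEC-CDS scores each trial directly with `cosine_score`. In both cases the trial scores are then summarized by the EER.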

