Acta Automatica Sinica, 2014

Gaussian PLDA for Speaker Verification and Its Joint Estimation

DOI: 10.3724/SP.J.1004.2014.01068, PP. 1068-1074

Xu Yun-Fei, Yang Hai, Zhou Ruo-Hua, Yan Yong-Hong

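Keywords: factor analysis, total variability factor (i-vector), probabilistic linear discriminant analysis, joint estimation, expectation maximization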


Abstract:

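In recent years, methods based on the total variability factor (i-vector) have become the mainstream approach in speaker recognition. Among them, probabilistic linear discriminant analysis (PLDA) has attracted wide attention for its strong performance. However, when the PLDA model is estimated with the conventional factor analysis method, only the model subspace is updated, so the model mean is not well coupled with the updated subspace. This paper proposes a joint estimation method that estimates the model mean and the model subspace simultaneously, which yields a more rigorous expectation-maximization (EM) update formula. On the NIST Speaker Recognition Evaluation 2010 extended test set and the 2012 core test set, the equal error rate improves to some extent.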
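
The abstract states the idea without formulas. As a rough illustration only, the sketch below shows one standard way a mean and a subspace can be updated together in EM, using an augmented latent vector. It assumes a simplified Gaussian PLDA model in which the j-th i-vector of speaker i is w_ij = m + V y_i + e_ij, with y_i ~ N(0, I) and e_ij ~ N(0, Sigma). All symbols (w_ij, m, V, y_i, Sigma, n_i) are assumptions introduced here, and the update shown is not necessarily the one derived in the paper.

```latex
% E-step (assumed model): posterior moments of the speaker factor y_i,
% given the n_i i-vectors of speaker i and the current m, V, \Sigma.
\begin{align}
L_i &= I + n_i V^{\top} \Sigma^{-1} V, \\
\mathbb{E}[y_i] &= L_i^{-1} V^{\top} \Sigma^{-1} \sum_{j=1}^{n_i} (w_{ij} - m), \\
\mathbb{E}[y_i y_i^{\top}] &= L_i^{-1} + \mathbb{E}[y_i]\,\mathbb{E}[y_i]^{\top}.
\end{align}

% Conventional M-step: m is held fixed (e.g. at the global sample mean)
% and only the subspace V is updated.
\begin{equation}
V = \Big( \sum_i \sum_{j=1}^{n_i} (w_{ij} - m)\, \mathbb{E}[y_i]^{\top} \Big)
    \Big( \sum_i n_i\, \mathbb{E}[y_i y_i^{\top}] \Big)^{-1}.
\end{equation}

% Joint M-step (augmented-latent sketch): stack a constant 1 under y_i so
% that m becomes the last column of an extended loading matrix.
\begin{align}
\tilde{y}_i &= \begin{bmatrix} y_i \\ 1 \end{bmatrix}, \qquad
\tilde{V} = \begin{bmatrix} V & m \end{bmatrix}, \qquad
\mathbb{E}[\tilde{y}_i \tilde{y}_i^{\top}] =
  \begin{bmatrix}
    \mathbb{E}[y_i y_i^{\top}] & \mathbb{E}[y_i] \\
    \mathbb{E}[y_i]^{\top} & 1
  \end{bmatrix}, \\
\tilde{V} &= \Big( \sum_i \sum_{j=1}^{n_i} w_{ij}\, \mathbb{E}[\tilde{y}_i]^{\top} \Big)
             \Big( \sum_i n_i\, \mathbb{E}[\tilde{y}_i \tilde{y}_i^{\top}] \Big)^{-1}.
\end{align}
```

In this sketch the joint update follows from setting the derivative of the expected complete-data log-likelihood with respect to the extended matrix to zero, so the mean and the subspace are re-estimated against the same posterior statistics rather than the mean being left at its initial value.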
