Acta Automatica Sinica, 2012

Decision-Tree-Based Acoustic Context Modeling in Discriminative Model Combination

DOI: 10.3724/SP.J.1004.2012.01449, PP. 1449-1458

Huang Hao, Li Binghu, Wushour Silamu

Keywords: discriminative model combination, context modeling, acoustic decision tree, minimum phone error, speech recognition

Abstract:

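Context-dependent discriminative model combination is limited by the large set of model weight parameters it introduces, which easily causes discriminative weight training to overfit when data are limited. To address this problem, this paper proposes using decision trees for context modeling, building the trees under the minimum phone error (MPE) criterion to obtain an optimal set of context-dependent weight parameters. During tree construction, the selection of the optimal questions is accelerated by evaluating a first-order approximation of the increase in the objective function, and a refined question set is used to obtain better acoustic discrimination. Speech recognition experiments based on multi-model combination show that the method makes weight training more robust to overfitting, lowers the recognition error rate while greatly reducing the number of parameters, and outperforms methods that perform the combination in the feature space.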
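The question-selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes that each context already has an accumulated gradient of the training objective with respect to its model weights, that contexts in one tree node share a single weight vector (so their gradients add), and that the first-order gain of a small gradient step on a node is proportional to the squared norm of its summed gradient. The step size is omitted because it does not change which question scores highest. All names (`best_question`, `summed_gradient`, the toy question set) are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Vector = List[float]


def summed_gradient(contexts: List[str], grads: Dict[str, Vector], dim: int) -> Vector:
    """Gradient w.r.t. a weight vector shared by all contexts in a node."""
    total = [0.0] * dim
    for ctx in contexts:
        for i, g in enumerate(grads[ctx]):
            total[i] += g
    return total


def squared_norm(v: Vector) -> float:
    return sum(x * x for x in v)


def best_question(
    node_contexts: List[str],
    grads: Dict[str, Vector],
    questions: Dict[str, Set[str]],
    dim: int,
) -> Tuple[Optional[str], float]:
    """Pick the question whose split gives the largest approximate gain.

    Assumed gain (up to a step-size factor):
        ||G_yes||^2 + ||G_no||^2 - ||G_parent||^2
    Returns (None, 0.0) if no valid split has a positive gain.
    """
    parent_score = squared_norm(summed_gradient(node_contexts, grads, dim))
    best_name: Optional[str] = None
    best_gain = 0.0
    for name, members in questions.items():
        yes = [c for c in node_contexts if c in members]
        no = [c for c in node_contexts if c not in members]
        if not yes or not no:
            continue  # the question does not actually split this node
        gain = (
            squared_norm(summed_gradient(yes, grads, dim))
            + squared_norm(summed_gradient(no, grads, dim))
            - parent_score
        )
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


if __name__ == "__main__":
    # Toy data: four contexts, two combined models (dim = 2).
    toy_grads = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [-1.0, 0.0], "d": [-1.0, 0.0]}
    toy_questions = {"is_vowel": {"a", "b"}, "mixed": {"a", "c"}}
    print(best_question(["a", "b", "c", "d"], toy_grads, toy_questions, dim=2))
```

In the toy data, the contexts whose gradients agree in sign end up in the same child under `is_vowel`, so that question is preferred over `mixed`, whose children have gradients that cancel. Scoring a candidate this way needs only gradient sums, not a re-optimization of the weights for every question, which is the kind of saving a first-order approximation is meant to provide.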
