
Acta Automatica Sinica (自动化学报), 2014

An Automatic Corpus Generation Algorithm for Statistical Language Modeling of Spoken Language

DOI: 10.3724/SP.J.1004.2014.02808, PP. 2808-2814

Si Yujing (司玉景), Xiao Yeming (肖业鸣), Xu Ji (徐及), Pan Jielin (潘接林), Yan Yonghong (颜永红)

Keywords: automatic speech recognition, resource scarcity, language model, equiprobable events, corpus generation algorithm


Abstract:

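In automatic speech recognition (ASR) domains where resources are relatively scarce, such as recognition systems for telephone conversations, statistical language models (LMs) suffer from severe data sparsity. This paper proposes a sampled-corpus generation algorithm based on equiprobable events that automatically generates domain-relevant text to strengthen statistical language model training. Experimental results show that adding the sampled corpus generated by this algorithm alleviates the sparsity of the language model and thereby improves the performance of the whole speech recognition system. On the development set, language model perplexity is reduced by 7.5% relative and the character error rate (CER) by 0.2 points absolute; on the test set, perplexity is reduced by 6% relative and CER by 0.4 points absolute.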
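The abstract reports perplexity as a relative reduction and CER as an absolute reduction, and the two are computed differently. The following is a minimal sketch of that arithmetic, not code from the paper: the function names and all numeric values are hypothetical and chosen only for illustration, and perplexity is assumed to be the usual exponentiated average negative log-probability per token.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities (standard definition)."""
    return math.exp(-sum(log_probs) / len(log_probs))


def relative_reduction(baseline, improved):
    """Reduction expressed as a fraction of the baseline value."""
    return (baseline - improved) / baseline


def absolute_reduction(baseline, improved):
    """Reduction expressed in the metric's own units (e.g. CER points)."""
    return baseline - improved


# Hypothetical values, NOT taken from the paper.
baseline_ppl, improved_ppl = 200.0, 185.0   # language model perplexity
baseline_cer, improved_cer = 30.0, 29.8     # character error rate, in percent

ppl_rel = relative_reduction(baseline_ppl, improved_ppl)
cer_abs = absolute_reduction(baseline_cer, improved_cer)

print(f"perplexity: {ppl_rel:.1%} relative reduction")
print(f"CER: {cer_abs:.1f} points absolute reduction")
```

With these made-up inputs, a drop from 200 to 185 in perplexity is a 7.5% relative reduction, while a drop from 30.0% to 29.8% in CER is a 0.2-point absolute reduction, which is how the two kinds of figure in the abstract should be read.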

