Han J Q, Zhang L, Zheng T R. Speech Signal Processing. Beijing, China: Tsinghua University Press, 2005 (in Chinese) (韩纪庆, 张磊, 郑铁然. 语音信号处理. 北京: 清华大学出版社, 2005)
[2]
Schwarz P. Phoneme Recognition Based on Long Temporal Context[EB/OL]. [2013-07-10]. http://speech.fit.vutbr.cz/software/phoneme-recognizer-based-long-temporal-context
[3]
Jansen A, Niyogi P. Point Process Models for Spotting Keywords in Continuous Speech. IEEE Trans on Audio, Speech, and Language Processing, 2009, 17(8): 1457-1470
[4]
Matějka P, Schwarz P, Černocký J, et al. Phonotactic Language Identification Using High Quality Phoneme Recognition // Proc of the 9th European Conference on Speech Communication and Technology. Lisbon, Portugal, 2005: 2237-2240
[5]
Hinton G E, Salakhutdinov R R. Reducing the Dimensionality of Data with Neural Networks. Science, 2006, 313(5786): 504-507
[6]
Deng L. An Overview of Deep-Structured Learning for Information Processing // Proc of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Xi'an, China, 2011: 1-14
[7]
Sivaram G S V S, Hermansky H. Sparse Multilayer Perceptron for Phoneme Recognition. IEEE Trans on Audio, Speech, and Language Processing, 2012, 20(1): 23-29
[8]
Yu D, Seide F, Li G, et al. Exploiting Sparseness in Deep Neural Networks for Large Vocabulary Speech Recognition // Proc of the IEEE International Conference on Acoustics, Speech, and Signal Processing. Kyoto, Japan, 2012: 4409-4412
[9]
Luo H. Restricted Boltzmann Machines: A Collaborative Filtering Perspective. Ph.D. Dissertation. Shanghai, China: Shanghai Jiao Tong University, 2011 (in Chinese) (罗恒. 基于协同过滤视角的受限玻尔兹曼机研究. 博士学位论文. 上海: 上海交通大学, 2011)
[10]
Mohamed A, Dahl G E, Hinton G. Acoustic Modeling Using Deep Belief Networks. IEEE Trans on Audio, Speech, and Language Processing, 2012, 20(1): 14-22
[11]
Siniscalchi S M, Yu D, Deng L, et al. Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model. IEEE Signal Processing Letters, 2013, 20(3): 201-204
[12]
Bergstra J, Breuleux O, Bastien F, et al. Theano: A CPU and GPU Math Compiler in Python [EB/OL]. [2013-07-01]. http://www.iro.umontreal.ca/~lisa/pointeurs/theano-scipy2010.pdf
[13]
Yu D, Seltzer M. Improved Bottleneck Features Using Pretrained Deep Neural Networks // Proc of the 12th Annual Conference of the International Speech Communication Association. Florence, Italy, 2011: 237-240
[14]
Grézl F, Karafiát M, Kontár S, et al. Probabilistic and Bottle-Neck Features for LVCSR of Meetings // Proc of the IEEE International Conference on Acoustics, Speech, and Signal Processing. Honolulu, USA, 2007, IV: 757-760
[15]
Mohamed A, Sainath T N, Dahl G, et al. Deep Belief Networks Using Discriminative Features for Phone Recognition // Proc of the IEEE International Conference on Acoustics, Speech, and Signal Processing. Prague, Czech Republic, 2011: 5060-5063