[1]
Schwenk H. Continuous Space Language Models. Computer Speech and Language, 2007, 21(3): 492-518
[2]
Bengio Y, Ducharme R, Vincent P, et al. A Neural Probabilistic Language Model. Journal of Machine Learning Research, 2003, 3: 1137-1155
[3]
Mikolov T, Karafiát M, Burget L, et al. Recurrent Neural Network Based Language Model // Proc of the 11th Annual Conference of the International Speech Communication Association. Makuhari, Japan, 2010: 1045-1048
[4]
Mikolov T, Kombrink S, Burget L, et al. Extensions of Recurrent Neural Network Language Model // Proc of the International Conference on Acoustics, Speech and Signal Processing. Prague, Czech Republic, 2011: 5528-5531
[5]
Bengio Y, Simard P, Frasconi P. Learning Long-Term Dependencies with Gradient Descent Is Difficult. IEEE Trans on Neural Networks, 1994, 5(2): 157-166
[6]
Son L H, Allauzen A, Yvon F. Measuring the Influence of Long Range Dependencies with Neural Network Language Models // Proc of the NAACL-HLT Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT. Montreal, Canada, 2012: 1-10
[7]
Martens J, Sutskever I. Learning Recurrent Neural Networks with Hessian-Free Optimization [EB/OL]. [2014-02-10]. http://www.icml-2011.org/papers/532_icmlpaper.pdf
[8]
Sundermeyer M, Schlüter R, Ney H. LSTM Neural Networks for Language Modeling [EB/OL]. [2014-02-10]. http://www-i6.informatik.rwth-aachen.de/publications/download/820/Sundermeyer-2012.pdf
[9]
Shi Y, Wiggers P, Jonker C M. Towards Recurrent Neural Networks Language Models with Linguistic and Contextual Features // Proc of the 13th Annual Conference of the International Speech Communication Association. Portland, USA, 2012: 1664-1667
[10]
Auli M, Galley M, Quirk C, et al. Joint Language and Translation Modeling with Recurrent Neural Networks // Proc of the Conference on Empirical Methods in Natural Language Processing. Seattle, USA, 2013: 1044-1054
[11]
Yao K, Zweig G, Hwang M Y, et al. Recurrent Neural Networks for Language Understanding [EB/OL]. [2014-02-10]. http://research.microsoft.com/pubs/200236/RNN4LU.pdf
[12]
Hinton G E. Learning Distributed Representations of Concepts // Proc of the 8th Annual Conference of the Cognitive Science Society. Amherst, USA, 1986: 1-12
[13]
Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space [EB/OL]. [2014-02-10]. http://arxiv.org/pdf/1301.3781.pdf
[14]
Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality [EB/OL]. [2014-02-10]. http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
[15]
Marcus M P, Marcinkiewicz M A, Santorini B. Building a Large Annotated Corpus of English: the Penn Treebank. Computational Linguistics, 1993, 19(2): 313-330
[16]
Mikolov T, Deoras A, Kombrink S, et al. Empirical Evaluation and Combination of Advanced Language Modeling Techniques [EB/OL]. [2014-02-14]. http://www.fit.vutbr.cz/~imikolov/~rnnlm/is2011_emp.pdf
[17]
Povey D, Ghoshal A, Boulianne G, et al. The Kaldi Speech Recognition Toolkit [EB/OL]. [2014-02-10]. http://homepages.inf.ed.ac.uk/aghoshal/pubs/asru11-kaldi.pdf