%0 Journal Article
%T Distilling Word Embeddings: An Encoding Approach
%A Lili Mou
%A Ge Li
%A Yan Xu
%A Lu Zhang
%A Zhi Jin
%J Computer Science
%D 2015
%I arXiv
%X Distilling knowledge from a well-trained cumbersome network to a small one has recently become a new research topic, as lightweight neural networks with high performance are particularly needed in various resource-restricted systems. This paper addresses the problem of distilling embeddings for NLP tasks. We propose an encoding approach to distill task-specific knowledge from high-dimensional embeddings, which retains high performance while substantially reducing model complexity. Experimental results show that our method outperforms directly training neural networks with small embeddings.
%U http://arxiv.org/abs/1506.04488v1