Computer Science 2015
Distilling Word Embeddings: An Encoding Approach

Abstract: Distilling knowledge from a well-trained, cumbersome network into a small one has recently become an active research topic, as lightweight neural networks with high performance are in particular demand in resource-constrained systems. This paper addresses the problem of distilling embeddings for NLP tasks. We propose an encoding approach that distills task-specific knowledge from high-dimensional embeddings, retaining high performance while substantially reducing model complexity. Experimental results show that our method outperforms directly training neural networks with small embeddings.
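The abstract gives only a high-level description of the encoding approach. The sketch below is one plausible reading, not the authors' implementation: a learned linear encoding layer projects fixed, high-dimensional embeddings down to a small dimension while being trained jointly with a task-specific model, and the projected vocabulary then serves as the distilled small embedding table. All dimensions, layer names, and the toy task are illustrative assumptions; PyTorch is assumed as the framework.

```python
# Minimal sketch (not the paper's code): distill large word embeddings into
# small ones via a learned linear "encoding" layer trained on a downstream task.
# Dimensions, data, and the averaging-based classifier are illustrative only.
import torch
import torch.nn as nn

VOCAB, DIM_LARGE, DIM_SMALL, NUM_CLASSES = 1000, 300, 50, 2

# Stand-in for well-trained, high-dimensional embeddings from the cumbersome
# model (random here so the example runs on its own).
large_emb = nn.Embedding(VOCAB, DIM_LARGE)
large_emb.weight.requires_grad = False  # keep the large embeddings fixed

# The encoding layer maps each large embedding to a small vector; it is
# trained jointly with a simple task-specific classifier.
encoder = nn.Linear(DIM_LARGE, DIM_SMALL)
classifier = nn.Linear(DIM_SMALL, NUM_CLASSES)

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3
)
loss_fn = nn.CrossEntropyLoss()

# Toy training batch: average the encoded word vectors per sentence, classify.
token_ids = torch.randint(0, VOCAB, (32, 10))   # 32 sentences, 10 tokens each
labels = torch.randint(0, NUM_CLASSES, (32,))

for _ in range(100):
    optimizer.zero_grad()
    small_vectors = encoder(large_emb(token_ids))   # (32, 10, DIM_SMALL)
    sentence_repr = small_vectors.mean(dim=1)       # simple averaging
    loss = loss_fn(classifier(sentence_repr), labels)
    loss.backward()
    optimizer.step()

# After training, the distilled small embedding table is the encoded vocabulary.
with torch.no_grad():
    small_emb_table = encoder(large_emb.weight)     # (VOCAB, DIM_SMALL)
print(small_emb_table.shape)
```

Under this reading, the small embeddings are task-specific by construction, since the encoding layer is optimized against the downstream objective rather than reconstructing the large embeddings.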