%0 Journal Article %T Saddle-free Hessian-free optimization for Deep Learning %A Martin Arjovsky %J Computer Science %D 2015 %I arXiv %X We develop a variant of the Hessian-free optimization method of Martens (2010) that implements the saddle-free Newton method (Dauphin et al., 2014) instead of classical Newton's method. It does so in time linear in the number of parameters in the network, which makes it scalable to very large problems. It is also easy to use, stable, and does not rely on a low-rank approximation of any version of the Hessian. Finally, it is memory efficient: it stores no matrices explicitly and uses only matrix-vector products to solve the problem at hand. %U http://arxiv.org/abs/1506.00059v1
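
The abstract's claim of storing no matrix and using only matrix-vector products rests on a standard autodiff primitive: a Hessian-vector product can be computed at roughly the cost of two gradient evaluations (Pearlmutter's trick), without ever forming the Hessian. Below is a minimal sketch of that primitive in JAX, not the paper's implementation; the names `hvp` and `loss` are illustrative assumptions.

    # Minimal sketch (not from the paper): matrix-free Hessian-vector
    # product via forward-over-reverse automatic differentiation.
    import jax
    import jax.numpy as jnp

    def hvp(f, params, v):
        # Differentiate grad(f) along direction v: the tangent output of
        # jvp is H @ v, so the full Hessian H is never materialized.
        return jax.jvp(jax.grad(f), (params,), (v,))[1]

    # Toy check on a quadratic loss, whose Hessian is 2 * A.T @ A.
    A = jnp.array([[1.0, 2.0], [3.0, 4.0]])
    loss = lambda w: jnp.sum((A @ w) ** 2)
    w = jnp.array([0.5, -1.0])
    v = jnp.array([1.0, 0.0])
    print(hvp(loss, w, v))           # matrix-free result
    print(2 * A.T @ A @ v)           # explicit Hessian, for comparison

Hessian-free methods of the kind the abstract describes feed such products into an iterative solver (e.g. conjugate gradient) to approximate a Newton step, which is what keeps both time and memory linear in the number of parameters.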