%0 Journal Article %T Parallel Dither and Dropout for Regularising Deep Neural Networks %A Andrew J. R. Simpson %J Computer Science %D 2015 %I arXiv %X Effective regularisation during training can mean the difference between success and failure for deep neural networks. Recently, dither has been suggested as an alternative to dropout for regularisation during batch-averaged stochastic gradient descent (SGD). In this article, we show that these methods fail without batch averaging, and we introduce a new, parallel regularisation method that may be used without batch averaging. Our results for parallel-regularised non-batch SGD are substantially better than what is possible with batch SGD. Furthermore, our results demonstrate that dither and dropout are complementary. %U http://arxiv.org/abs/1508.07130v1