Search Results: 1 - 10 of 100 matches
All listed articles are free for downloading (OA Articles)
Do Deep Nets Really Need to be Deep?  [PDF]
Lei Jimmy Ba, Rich Caruana
Computer Science , 2013,
Abstract: Currently, deep neural networks are the state of the art on problems such as speech recognition and computer vision. In this extended abstract, we show that shallow feed-forward networks can learn the complex functions previously learned by deep nets and achieve accuracies previously only achievable with deep models. Moreover, in some cases the shallow neural nets can learn these deep functions using a total number of parameters similar to the original deep model. We evaluate our method on the TIMIT phoneme recognition task and are able to train shallow fully-connected nets that perform similarly to complex, well-engineered, deep convolutional architectures. Our success in training shallow neural nets to mimic deeper models suggests that there probably exist better algorithms for training shallow feed-forward nets than those currently available.
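The mimic setup the abstract describes can be sketched in a few lines: train a small student model to regress the teacher's logits under an L2 loss. Everything below (the toy teacher network, the sizes, and a deliberately minimal linear student) is hypothetical illustration, not the authors' TIMIT setup.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, h_t, k = 512, 8, 32, 3

# Hypothetical "deep teacher": a fixed nonlinear map producing logits.
W_t1 = rng.normal(size=(d, h_t))
W_t2 = rng.normal(size=(h_t, k))

def teacher_logits(X):
    return np.tanh(X @ W_t1) @ W_t2

X = rng.normal(size=(n, d))
T = teacher_logits(X)                 # teacher logits are the regression targets

# Student (here the simplest possible: a single linear layer) trained with an
# L2 loss on the teacher's logits -- the mimic setup described above.
W_s = np.zeros((d, k))
lr = 0.05
losses = []
for _ in range(300):
    err = X @ W_s - T
    losses.append(float(np.mean(err ** 2)))
    W_s -= lr * X.T @ err / n         # gradient step on the mimic loss
```

The key point of the paper is that the targets are the teacher's real-valued logits rather than hard labels, which carries much more information per example.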
Deep Belief Nets for Topic Modeling  [PDF]
Lars Maaloe, Morten Arngren, Ole Winther
Computer Science , 2015,
Abstract: Applying traditional collaborative filtering to digital publishing is challenging because user data is very sparse due to the high volume of documents relative to the number of users. Content-based approaches, on the other hand, are attractive because textual content is often very informative. In this paper we describe large-scale content-based collaborative filtering for digital publishing. To solve the digital publishing recommender problem we compare two approaches: latent Dirichlet allocation (LDA) and deep belief nets (DBN), both of which find low-dimensional latent representations for documents. Efficient retrieval can be carried out in the latent representation. We work on both public benchmarks and digital media content provided by Issuu, an online publishing platform. This article also comes with a newly developed deep belief nets toolbox for topic modeling, tailored towards performance evaluation of the DBN model and comparisons to the LDA model.
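As an illustration of retrieval in a low-dimensional latent representation, the sketch below substitutes a truncated SVD for the paper's LDA/DBN models; the toy term-document matrix and all sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy bag-of-words matrix: 100 documents over a 50-term vocabulary, drawn
# from two topic profiles (a stand-in for real publishing content).
topics = rng.random((2, 50))
labels = rng.integers(0, 2, size=100)
counts = rng.poisson(5 * topics[labels]).astype(float)

# Low-dimensional latent representation. The paper uses LDA or a DBN; a
# truncated SVD is used here purely as an illustrative substitute.
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
Z = U[:, :2] * S[:2]                     # 2-d latent code, one row per document

def retrieve(q, k=5):
    """Return indices of the k documents closest to document q in latent space."""
    z = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    sims = z @ z[q]                      # cosine similarity to the query doc
    order = np.argsort(-sims)
    return [int(i) for i in order if i != q][:k]

neighbours = retrieve(0)
```

Whatever model produces the codes, retrieval reduces to nearest-neighbour search in the low-dimensional space, which is the efficiency argument the abstract makes.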
Bigeometric Organization of Deep Nets  [PDF]
Alexander Cloninger, Ronald R. Coifman, Nicholas Downing, Harlan M. Krumholz
Computer Science , 2015,
Abstract: In this paper, we build an organization of high-dimensional datasets that cannot be cleanly embedded into a low-dimensional representation due to missing entries and a subset of the features being irrelevant to modeling functions of interest. Our algorithm begins by defining coarse neighborhoods of the points and an expected empirical function value on these neighborhoods. We then generate new non-linear features with deep net representations tuned to model the approximate function, and re-organize the geometry of the points with respect to the new representation. Finally, the points are locally z-scored to create an intrinsic geometric organization which is independent of the parameters of the deep net, a geometry designed to assure smoothness with respect to the empirical function. We examine this approach on data from the Center for Medicare and Medicaid Services Hospital Quality Initiative, and generate an intrinsic low-dimensional organization of the hospitals that is smooth with respect to an expert-driven function of quality.
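The local z-scoring step, one ingredient of the organization described above, can be sketched as follows; the toy dataset, neighborhood size, and brute-force distance computation are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy dataset: 100 points in 5-d whose scale varies strongly from point to point.
X = rng.normal(size=(100, 5)) * rng.uniform(0.5, 5.0, size=(100, 1))

def local_zscore(X, k=10):
    """Z-score each point against its k nearest neighbours, so the resulting
    geometry no longer depends on local scale."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise dists
    Z = np.empty_like(X)
    for i in range(len(X)):
        nbrs = np.argsort(D[i])[:k]          # k nearest, including i itself
        mu = X[nbrs].mean(axis=0)
        sd = X[nbrs].std(axis=0) + 1e-12
        Z[i] = (X[i] - mu) / sd
    return Z

Z = local_zscore(X)
```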
Deep Fishing: Gradient Features from Deep Nets  [PDF]
Albert Gordo, Adrien Gaidon, Florent Perronnin
Computer Science , 2015,
Abstract: Convolutional Networks (ConvNets) have recently improved image recognition performance thanks to end-to-end learning of deep feed-forward models from raw pixels. Deep learning is a marked departure from the previous state of the art, the Fisher Vector (FV), which relied on gradient-based encoding of local hand-crafted features. In this paper, we discuss a novel connection between these two approaches. First, we show that one can derive gradient representations from ConvNets in a similar fashion to the FV. Second, we show that this gradient representation actually corresponds to a structured matrix that allows for efficient similarity computation. We experimentally study the benefits of transferring this representation over the outputs of ConvNet layers, and find consistent improvements on the Pascal VOC 2007 and 2012 datasets.
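The structured-matrix point can be made concrete for a softmax last layer: the gradient of the cross-entropy loss with respect to the last-layer weights is a rank-1 outer product, so the Frobenius inner product of two such gradients factorizes and never requires the full matrices. The layer sizes and weights below are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

k, d = 4, 16                                   # classes, feature dimension
W = rng.normal(size=(k, d))                    # toy last-layer weights

def grad_feature(x, y):
    """Gradient of the cross-entropy loss w.r.t. the last-layer weights:
    the rank-1 outer product (p - onehot(y)) x^T."""
    p = softmax(W @ x)
    r = p - np.eye(k)[y]
    return np.outer(r, x), r

x1, x2 = rng.normal(size=d), rng.normal(size=d)
G1, r1 = grad_feature(x1, 0)
G2, r2 = grad_feature(x2, 1)

# Structured-matrix shortcut: <G1, G2>_F = <r1, r2> * <x1, x2>.
full = float(np.sum(G1 * G2))
fast = float((r1 @ r2) * (x1 @ x2))
```

This factorization is the kind of structure that makes similarity computation between gradient representations efficient.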
Direct Loss Minimization for Training Deep Neural Nets  [PDF]
Yang Song, Alexander G. Schwing, Richard S. Zemel, Raquel Urtasun
Computer Science , 2015,
Abstract: Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains, we are interested in performing well on application-specific metrics. In this paper we propose a direct loss minimization approach to train deep neural networks, taking into account the application-specific loss functions. This can be non-trivial when these functions are non-smooth and non-decomposable. We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we propose a dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection.
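Average precision is a good example of a non-smooth, non-decomposable metric: it is computed from the induced ranking rather than summed independently over examples. Below is a minimal sketch of the metric itself (not the authors' direct-loss training or their dynamic programming algorithm); the scores and labels are made up.

```python
import numpy as np

def average_precision(scores, labels):
    """AP of a ranking: mean of precision@i taken at the positions of positives."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    ranked = np.asarray(labels)[order]
    hits = np.cumsum(ranked)
    precisions = hits / (np.arange(len(ranked)) + 1)
    return float(np.sum(precisions * ranked) / max(ranked.sum(), 1))

# A perfect ranking gets AP = 1; pushing one positive below a negative lowers it.
ap_perfect = average_precision([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
ap_flawed  = average_precision([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0])
```

The sort inside the metric is exactly what makes its gradient ill-defined, which motivates direct loss minimization schemes.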
Intrusion detection model based on deep belief nets  [PDF]
Gao Ni, Gao Ling, He Yiyue, Gao Quanli, Ren Jie
- , 2015, DOI: 10.3969/j.issn.1003-7985.2015.03.007
Abstract: This paper focuses on the intrusion classification of huge amounts of data in a network intrusion detection system. An intrusion detection model based on deep belief nets (DBN) is proposed to conduct intrusion detection, and the principles regarding DBN are discussed. The DBN is composed of multiple unsupervised restricted Boltzmann machines (RBMs) and a supervised back-propagation (BP) network. First, the DBN in the proposed model is pre-trained in a fast and greedy way, and each RBM is trained by the contrastive divergence algorithm. Second, the whole network is fine-tuned by the supervised BP algorithm, which simultaneously classifies the low-dimensional features of the intrusion data generated by the last RBM layer. The experimental results on the KDD CUP 1999 dataset demonstrate that a DBN using an RBM network with three or more layers outperforms self-organizing maps (SOM) and neural networks (NN) in intrusion classification. Therefore, the DBN is an efficient approach for intrusion detection in high-dimensional space.
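The RBM building block and its contrastive divergence (CD-1) update can be sketched on toy binary data; the data generator, layer sizes, and learning rate below are illustrative assumptions, not the paper's KDD CUP configuration.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy binary data with correlated features (a stand-in for intrusion records):
# 3 latent bits, each repeated across 4 visible units.
n, d, h = 200, 12, 6
base = (rng.random((n, 3)) > 0.5)
V = np.repeat(base, 4, axis=1).astype(float)

W = 0.01 * rng.normal(size=(d, h))
b_v, b_h = np.zeros(d), np.zeros(h)

def cd1_step(V, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM."""
    global W, b_v, b_h
    ph = sigmoid(V @ W + b_h)                        # positive phase
    hs = (rng.random(ph.shape) < ph).astype(float)   # sample hidden states
    pv = sigmoid(hs @ W.T + b_v)                     # reconstruction
    ph2 = sigmoid(pv @ W + b_h)                      # negative phase
    W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
    b_v += lr * (V - pv).mean(axis=0)
    b_h += lr * (ph - ph2).mean(axis=0)
    return float(np.mean((V - pv) ** 2))             # reconstruction error

errors = [cd1_step(V) for _ in range(200)]
```

Stacking several such RBMs, then fine-tuning with supervised back-propagation, gives the DBN pipeline the abstract describes.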
Learning Sparse Feature Representations using Probabilistic Quadtrees and Deep Belief Nets  [PDF]
Saikat Basu, Manohar Karki, Sangram Ganguly, Robert DiBiano, Supratik Mukhopadhyay, Ramakrishna Nemani
Computer Science , 2015,
Abstract: Learning sparse feature representations is a useful instrument for solving an unsupervised learning problem. In this paper, we present three labeled handwritten digit datasets, collectively called n-MNIST. Then, we propose a novel framework for the classification of handwritten digits that learns sparse representations using probabilistic quadtrees and Deep Belief Nets. On the MNIST and n-MNIST datasets, our framework shows promising results and significantly outperforms traditional Deep Belief Networks.
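A plain (non-probabilistic) quadtree decomposition conveys the sparsity idea: homogeneous regions collapse to single leaves, so a digit-like image is encoded by far fewer values than one per pixel. This is a simplified stand-in for the paper's probabilistic quadtrees, on a made-up 8x8 image.

```python
import numpy as np

def quadtree_leaves(img, x=0, y=0, size=None, min_size=2):
    """Recursively split an image quadrant until it is homogeneous (or tiny),
    returning leaf regions as (x, y, size, mean) tuples -- a sparse encoding."""
    if size is None:
        size = img.shape[0]
    block = img[y:y + size, x:x + size]
    if size <= min_size or block.min() == block.max():
        return [(x, y, size, float(block.mean()))]
    half = size // 2
    leaves = []
    for dx, dy in [(0, 0), (half, 0), (0, half), (half, half)]:
        leaves += quadtree_leaves(img, x + dx, y + dy, half, min_size)
    return leaves

# 8x8 image: uniform background with a 2x2 "stroke" in one corner.
img = np.zeros((8, 8), dtype=int)
img[0:2, 0:2] = 1
leaves = quadtree_leaves(img)
```

Here the 64-pixel image collapses to 7 leaves: large empty quadrants become single entries, which is the sparse representation the feature learner then consumes.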
Why are deep nets reversible: A simple theory, with implications for training  [PDF]
Sanjeev Arora, Yingyu Liang, Tengyu Ma
Computer Science , 2015,
Abstract: Generative models for deep learning are promising both for improving understanding of the model and for yielding training methods that require fewer labeled samples. Recent works use generative-model approaches to produce the deep net's input given the value of a hidden layer several levels above. However, there is no accompanying "proof of correctness" for the generative model, showing that the feedforward deep net is the correct inference method for recovering the hidden layer given the input. Furthermore, these models are complicated. The current paper takes a more theoretical tack. It presents a very simple generative model for ReLU deep nets, with the following characteristics: (i) The generative model is just the reverse of the feedforward net: if the forward transformation at a layer is $A$ then the reverse transformation is $A^T$. (This can be seen as an explanation of the old weight-tying idea for denoising autoencoders.) (ii) Its correctness can be proven under a clean theoretical assumption: the edge weights in real-life deep nets behave like random numbers. Under this assumption, which is experimentally tested on real-life nets like AlexNet, it is formally proved that the feedforward net is a correct inference method for recovering the hidden layer. The generative model suggests a simple modification for training: use the generative model to produce synthetic data with labels and include it in the training set. Experiments support this theory of random-like deep nets and show that the suggested modification helps training.
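Point (i) can be checked numerically for a single wide random ReLU layer: with Gaussian weights scaled by sqrt(2/m), applying the transpose to the ReLU activations approximately recovers the input. The dimensions and scaling below are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

n, m = 20, 20000                       # signal dimension, (wide) hidden dimension
x = rng.normal(size=n)

# Random-weight layer, matching the "weights behave like random numbers" assumption.
A = rng.normal(size=(m, n)) * np.sqrt(2.0 / m)
h = np.maximum(A @ x, 0.0)             # forward ReLU layer

# The reverse transformation is just the transpose of the forward one.
x_hat = A.T @ h

cosine = float(x @ x_hat / (np.linalg.norm(x) * np.linalg.norm(x_hat)))
```

For Gaussian rows, E[a * max(a.x, 0)] is proportional to x, so the transpose reconstruction concentrates around the input as the layer widens.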
Return of the Devil in the Details: Delving Deep into Convolutional Nets  [PDF]
Ken Chatfield, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
Computer Science , 2014,
Abstract: The latest generation of Convolutional Neural Networks (CNNs) has achieved impressive results in challenging benchmarks on image recognition and object detection, significantly raising the interest of the community in these methods. Nevertheless, it is still unclear how different CNN methods compare with each other and with previous state-of-the-art shallow representations such as the Bag-of-Visual-Words and the Improved Fisher Vector. This paper conducts a rigorous evaluation of these new techniques, exploring different deep architectures and comparing them on a common ground, identifying and disclosing important implementation details. We identify several useful properties of CNN-based representations, including the fact that the dimensionality of the CNN output layer can be reduced significantly without having an adverse effect on performance. We also identify aspects of deep and shallow methods that can be successfully shared. In particular, we show that the data augmentation techniques commonly applied to CNN-based methods can also be applied to shallow methods, and result in an analogous performance boost. Source code and models to reproduce the experiments in the paper are made publicly available.
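The data augmentation discussed above (crops and flips of each image) is method-agnostic, which is why it transfers from CNN pipelines to shallow encoders. A minimal sketch, with made-up image and crop sizes:

```python
import numpy as np

rng = np.random.default_rng(7)

def augment(img):
    """Flip-and-crop augmentation: the resulting views can feed a CNN or,
    equally, a shallow pipeline such as a Fisher Vector encoder."""
    views = [img, img[:, ::-1]]                  # original + horizontal flip
    crops = []
    for v in views:
        for y, x in [(0, 0), (0, 4), (4, 0), (4, 4), (2, 2)]:
            crops.append(v[y:y + 28, x:x + 28])  # five 28x28 crops per view
    return np.stack(crops)

img = rng.random((32, 32))
batch = augment(img)                             # 10 augmented views of one image
```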
DEEP-CARVING: Discovering Visual Attributes by Carving Deep Neural Nets  [PDF]
Sukrit Shankar, Vikas K. Garg, Roberto Cipolla
Computer Science , 2015,
Abstract: Most of the approaches for discovering visual attributes in images demand significant supervision, which is cumbersome to obtain. In this paper, we aim to discover visual attributes in a weakly supervised setting that is commonly encountered with contemporary image search engines. Deep Convolutional Neural Networks (CNNs) have enjoyed remarkable success in vision applications recently. However, in a weakly supervised scenario, widely used CNN training procedures do not learn a robust model for predicting multiple attribute labels simultaneously. The primary reason is that the attributes highly co-occur within the training data. To ameliorate this limitation, we propose Deep-Carving, a novel training procedure with CNNs that helps the net efficiently carve itself for the task of multiple attribute prediction. During training, the responses of the feature maps are exploited in an ingenious way to provide the net with multiple pseudo-labels (for training images) for subsequent iterations. The process is repeated periodically after a fixed number of iterations, and enables the net to carve itself iteratively for efficiently disentangling features. Additionally, we contribute CAMIT-NSAD, a noun-adjective-pairing-inspired Natural Scenes Attributes Dataset containing a number of co-occurring attributes within a noun category, to the research community. We describe, in detail, salient aspects of this dataset. Our experiments on CAMIT-NSAD and the SUN Attributes Dataset, with weak supervision, clearly demonstrate that the Deep-Carved CNNs consistently achieve considerable improvement in the precision of attribute prediction over popular baseline methods.
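The pseudo-labelling idea can be sketched in a heavily simplified form: keep the known weak labels and fill unknown attribute slots only where the net's current responses are confident. The score matrix, thresholds, and label encoding below are all hypothetical, not the paper's carving schedule.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical attribute scores from a net's sigmoid outputs: 6 training
# images x 4 co-occurring attributes, values in [0, 1].
scores = rng.random((6, 4))

# Weak labels mark only one attribute per image; the rest are unknown (-1).
weak = -np.ones((6, 4), dtype=int)
weak[np.arange(6), rng.integers(0, 4, size=6)] = 1

# Carving-style pseudo-labelling (simplified): keep known labels, and fill the
# unknown slots from the net's confident responses for the next iterations.
hi, lo = 0.7, 0.3
pseudo = weak.copy()
unknown = weak == -1
pseudo[unknown & (scores >= hi)] = 1
pseudo[unknown & (scores <= lo)] = 0

n_filled = int(np.sum((pseudo != -1) & unknown))
```

Repeating this periodically lets the net bootstrap multi-attribute targets from its own feature-map responses, which is the core of the procedure described above.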

Copyright © 2008-2017 Open Access Library. All rights reserved.