Zheng Y, Zhang Y J, Larochelle H. A supervised neural autoregressive topic model for simultaneous image classification and annotation: 1305.5306. New York, USA: Cornell University, 2013.
[2]
Larochelle H, Lauly S. A neural autoregressive topic model[C]//Advances in Neural Information Processing Systems. Nevada, United States: MIT Press, 2012: 2717-2725.
[3]
Ji S, Xu W, Yang M, et al. 3D convolutional neural networks for human action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1):221-231.
[4]
Baccouche M, Mamalet F, Wolf C, et al. Spatio-temporal convolutional sparse auto-encoder for sequence classification[C]//Proceedings of the British Machine Vision Conference. Guildford, UK: BMVA Press, 2012.
[5]
Taylor G W, Fergus R, LeCun Y, et al. Convolutional learning of spatio-temporal features[C]//Proceedings of the 11th European Conference on Computer Vision. Heraklion, Crete, Greece:Springer, 2010, 6316:140-153.
[6]
Ranzato M A, Huang F J, Boureau Y L, et al. Unsupervised learning of invariant feature hierarchies with applications to object recognition[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, MN:IEEE, 2007: 1-8.
[7]
Zeiler M D, Krishnan D, Taylor G W, et al. Deconvolutional networks[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA:IEEE, 2010:2528-2535.
[8]
Ranzato M, Susskind J, Mnih V, et al. On deep generative models with applications to recognition[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI:IEEE, 2011:2857-2864.
[9]
Kong B. Comparison between human vision and computer vision[J]. Nature Magazine, 2002, 24(1):51-55. (in Chinese)
[10]
Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2):91-110.
[11]
Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA:IEEE, 2005, 1:886-893.
[12]
Ojala T, Pietikainen M, Harwood D. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions[C]//Proceedings of ICPR. Jerusalem, Israel: IEEE, 1994, 1:582-585.
[13]
Matas J, Chum O, Urban M, et al. Robust wide-baseline stereo from maximally stable extremal regions[J]. Image and Vision Computing, 2004, 22(10):761-767.
[14]
Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7):1527-1554.
[15]
Hinton G E. Learning multiple layers of representation[J]. Trends in Cognitive Sciences, 2007, 11(10):428-434.
[16]
Hinton G E, Zemel R S. Autoencoders, minimum description length, and Helmholtz free energy[C]//Advances in Neural Information Processing Systems. Burlington, USA: Morgan Kaufmann, 1994:3-10.
[17]
Rumelhart D E, Hinton G E, Williams R J. Learning Representations by Back-Propagating Errors[M]. Cognitive Modeling: MIT Press, 2002, 1:213.
[18]
Vincent P, Larochelle H, Bengio Y, et al. Extracting and composing robust features with denoising autoencoders[C]//Proceedings of the 25th International Conference on Machine Learning. New York, NY, USA:ACM, 2008:1096-1103.
[19]
Lee H, Grosse R, Ranganath R, et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations[C]//Proceedings of the 26th Annual International Conference on Machine Learning. New York, NY, USA:ACM, 2009:609-616.
[20]
Krizhevsky A, Hinton G E. Factored 3-way restricted Boltzmann machines for modeling natural images[C]//Proceedings of International Conference on Artificial Intelligence and Statistics. Sardinia, Italy: 2010: 621-628.
[21]
Ranzato M A, Hinton G E. Modeling pixel means and covariances using factorized third-order Boltzmann machines[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA:IEEE, 2010:2551-2558.
[22]
Lee T S, Mumford D, Romero R, et al. The role of the primary visual cortex in higher level vision[J]. Vision Research, 1998, 38(15-16):2429-2454.
[23]
Lee T S, Mumford D. Hierarchical Bayesian inference in the visual cortex[J]. JOSA A, 2003, 20(7):1434-1448.
[24]
Arel I, Rose D C, Karnowski T P. Deep machine learning: a new frontier in artificial intelligence research[J]. IEEE Computational Intelligence Magazine, 2010, 5(4):13-18.
[25]
Deng J, Berg A, Satheesh S, et al. Large scale visual recognition challenge[EB/OL].[2013-11-14]. http://www.image-net.org/challenges/LSVRC/2012/index.
[26]
Everingham M. Visual object classes challenge[EB/OL].[2013-11-14]. http://pascallin.ecs.soton.ac.uk/challenges/VOC/.
[27]
Hinton G E, Sejnowski T J, Ackley D H. Boltzmann Machines: Constraint Satisfaction Networks that Learn[M]. Pittsburgh, PA:Carnegie-Mellon University, Department of Computer Science, 1984.
[28]
Andrieu C, de Freitas N, Doucet A, et al. An introduction to MCMC for machine learning[J]. Machine Learning, 2003, 50(1-2):5-43.
[29]
[30]
Hinton G E. Training products of experts by minimizing contrastive divergence[J]. Neural Computation, 2002, 14(8):1771-1800.
[31]
Bengio Y, Lamblin P, Popovici D, et al. Greedy layer-wise training of deep networks[C]//Advances in Neural Information Processing Systems. Vancouver, BC, Canada: MIT Press, 2007, 19:153.
[32]
Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786):504-507.
[33]
Hinton G E, Welling M, Teh Y W, et al. A new view of ICA[C]//Proceedings of Int. Conf. on Independent Component Analysis and Blind Source Separation. San Diego, CA: 2001:746-751.
[34]
Olshausen B A, Field D J. Sparse coding with an overcomplete basis set: a strategy employed by V1?[J]. Vision Research, 1997, 37(23):3311-3326.
[35]
Rehn M, Sommer F T. A network that uses few active neurones to code visual input predicts the diverse shapes of cortical receptive fields[J]. Journal of Computational Neuroscience, 2007, 22(2):135-146.
[36]
Lee H, Ekanadham C, Ng A. Sparse deep belief net model for visual area V2[C]//Advances in Neural Information Processing Systems. Cambridge, MA:MIT Press, 2008:873-880.
[37]
Krizhevsky A, Nair V, Hinton G. The CIFAR-10 Dataset[DB/OL].[2013-11-14]. http://www.cs.toronto.edu/~kriz/cifar.html.
[38]
Larochelle H, Murray I. The neural autoregressive distribution estimator[C]//Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, AISTATS 2011. Fort Lauderdale, FL, United States: Microtome Publishing, 2011: 29-37.
[39]
Le Q V, Ranzato M, Monga R, et al. Building high-level features using large scale unsupervised learning: 1112.6209. New York, USA: Cornell University, 2012.
[40]
Wang C, Blei D, Li F F. Simultaneous image classification and annotation[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL:IEEE, 2009:1903-1910.
[41]
Blei D M, Ng A Y, Jordan M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3:993-1022.