- 2016
Human Action Recognition Using Spatio-temporal Features Learned by Multi-layer Independent Subspace Analysis
Abstract:
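Human action recognition is important in fields such as video surveillance and medical diagnosis. Current approaches to human action recognition mainly either extend hand-designed two-dimensional features to three dimensions or use motion trajectories to extract spatio-temporal features. Following the idea of deep learning, this paper builds a multi-layer neural network directly in three-dimensional space and learns the spatio-temporal features of different actions from a large amount of video data. First, independent subspace analysis (ISA) is used to construct a two-layer stacked convolutional network whose weights are learned from training videos. Then the features are clustered with K-means and converted into visual words, and the decision hyperplane of a support vector machine (SVM) is computed from the visual-word frequency histograms. Finally, the actions in the videos to be analyzed are classified. Experiments on the 12 action classes of the Hollywood2 dataset show that the feature weights learned by ISA resemble Gabor filters: they are clearly selective for image frequency and orientation and robust to phase changes. The learned features can significantly improve action recognition accuracy and are consistent with the characteristics of human vision.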
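The two ideas at the core of the pipeline, ISA pooling and the visual-word histogram, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the standard ISA activation p_i = sqrt(sum_k V_ik (W_k · x)^2), which the abstract does not spell out. The filters, the toy signal, and the function names `isa_response`, `nearest_word`, and `word_histogram` are hypothetical and chosen only for illustration. In the real method W is learned from video blocks, the centroids come from K-means, and the histograms are passed to an SVM; all three steps are omitted here.

```python
import math

def isa_response(x, W, V):
    """Pooled ISA activations: p_i = sqrt(sum_k V[i][k] * (W[k] . x)^2)."""
    # First layer: squared linear filter responses.
    squared = [sum(w * xi for w, xi in zip(row, x)) ** 2 for row in W]
    # Second layer: square root of the sum over each subspace.
    return [math.sqrt(sum(v * s for v, s in zip(vrow, squared))) for vrow in V]

def nearest_word(feature, centroids):
    """Index of the closest centroid (squared Euclidean distance)."""
    dists = [sum((f - c) ** 2 for f, c in zip(feature, cen)) for cen in centroids]
    return dists.index(min(dists))

def word_histogram(features, centroids):
    """Normalized visual-word frequency histogram for one video."""
    counts = [0] * len(centroids)
    for f in features:
        counts[nearest_word(f, centroids)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

# Toy phase-robustness check (assumed setup): a cosine/sine filter pair of the
# same frequency is pooled into one subspace, loosely mimicking a Gabor pair.
n = 16
freq = 2 * math.pi * 2 / n
W = [[math.cos(freq * t) for t in range(n)],
     [math.sin(freq * t) for t in range(n)]]
V = [[1, 1]]  # one subspace containing both filters

def grating(phase):
    """1-D sinusoid at the filters' frequency with the given phase."""
    return [math.cos(freq * t + phase) for t in range(n)]

r0 = isa_response(grating(0.0), W, V)[0]
r1 = isa_response(grating(1.0), W, V)[0]

# Toy bag-of-words step with hand-picked (not K-means) centroids.
hist = word_histogram([[1, 0], [9, 9], [0, 1], [11, 10]], [[0, 0], [10, 10]])
```

In this toy setup the pooled response is the same for both phases, because the squared responses of the two quadrature filters sum to a constant. This is one way to read the abstract's claim that the learned features are robust to phase changes.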