Control Theory & Applications, 2015
Application of a Learnable Receptive Field Algorithm Based on Deep Networks to Image Classification
Abstract:
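As a fundamental task for image retrieval, image organization, and robot vision, image classification has received wide attention in computer vision and machine learning. Deep learning-based models for object recognition and image classification have likewise attracted great interest in the field. This paper proposes an algorithm that replaces scale-invariant feature transform (SIFT) and histogram of oriented gradients (HOG) descriptors: it uses a deep hierarchical architecture to learn effective image representations layer by layer, learning features directly from raw pixels. The method uses K-singular value decomposition (K-SVD) for dictionary training and orthogonal matching pursuit (OMP) for encoding. In addition, it adopts a scheme that learns the classifier and the receptive fields used for pooling simultaneously. Experimental results show that the algorithm achieves better classification performance on object (Oxford flowers) and event (UIUC-sports) image classification benchmarks.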
Visual categorization has attracted increasing interest in computer vision and machine learning because it is a fundamental task for image retrieval, image organization, and robotic vision. Over the past decade, various deep learning-based models have been proposed and broadly applied to visual recognition and categorization. In this paper, the proposed approach learns features from scratch rather than employing hand-crafted descriptors such as SIFT and HOG. A deep hierarchical architecture for learning effective image representations is built up layer by layer. Specifically, K-SVD is used in the training phase and OMP in the encoding phase because of their simplicity and efficiency. Sum, average, and max operators are the three pooling strategies commonly used in modern categorization models. Instead of such a fixed pooling pattern, we apply an improved scheme that learns the receptive fields for pooling together with the classifier. We provide a detailed analysis of deep networks on event and object categorization tasks and compare our method with several state-of-the-art algorithms, including kernel-based feature learning and saliency-weighted hierarchical sparse coding. Experimental results show that our algorithm achieves better classification performance than the compared methods on the UIUC-sports and Oxford flowers datasets.
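
The encoding and pooling steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `omp_encode` and `pool_region`, the patch and dictionary sizes, and the list of candidate rectangles are all hypothetical. A random unit-norm dictionary stands in for one trained with K-SVD, and max pooling is taken over absolute code values, which is an assumption. The joint learning of receptive fields and classifier is not shown; the sketch only builds the pooled features over candidate regions from which such a scheme could select.

```python
import numpy as np

def omp_encode(D, x, n_nonzero):
    """Greedy OMP: approximate x using at most n_nonzero columns (atoms) of D."""
    x = np.asarray(x, dtype=float)
    code = np.zeros(D.shape[1])
    support, coef = [], np.zeros(0)
    residual = x.copy()
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        code[support] = coef
    return code

def pool_region(code_map, region, op="max"):
    """Pool an (H, W, K) map of sparse codes over one rectangular receptive field."""
    r0, r1, c0, c1 = region
    block = code_map[r0:r1, c0:c1].reshape(-1, code_map.shape[-1])
    if op == "sum":
        return block.sum(axis=0)
    if op == "average":
        return block.mean(axis=0)
    if op == "max":
        # Assumption: max pooling over absolute code values.
        return np.abs(block).max(axis=0)
    raise ValueError(f"unknown pooling operator: {op}")

rng = np.random.default_rng(0)

# Stand-in dictionary (hypothetical sizes): 64 unit-norm atoms for 5x5 patches.
D = rng.standard_normal((25, 64))
D /= np.linalg.norm(D, axis=0)

# A 4x4 grid of flattened patches, each encoded with at most 5 atoms.
patches = rng.standard_normal((4, 4, 25))
code_map = np.array([[omp_encode(D, p, 5) for p in row] for row in patches])

# Hypothetical candidate receptive fields: whole grid, top half, bottom half.
candidates = [(0, 4, 0, 4), (0, 2, 0, 4), (2, 4, 0, 4)]
features = np.concatenate([pool_region(code_map, r) for r in candidates])
```

Here `features` concatenates one pooled vector per candidate region. A scheme that learns receptive fields would train the classifier on such candidates and keep only the regions that help classification, rather than fixing the regions in advance.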