CDNet: Spatial Vectors for Two-View Correspondence Learning
Abstract:
Feature matching is a fundamental and important task in computer vision that aims to find the correct correspondences (i.e., inliers) between a given pair of images. Strictly speaking, feature matching comprises four steps: feature extraction, feature description, establishment of an initial correspondence set, and removal of false correspondences (i.e., outlier removal). However, existing methods consider only the relationships among corresponding points and neglect the visual information available from the scene images. In this paper, we propose a novel pruning framework, Context Depth Net (CDNet), to accurately identify inliers and recover camera poses. We extract directional information from correspondences as a cue to guide the pruning process, exploit vector fields to better mine the deep spatial information among correspondences, and design a set of fusion modules to integrate this spatial information more effectively. Experiments show that the proposed CDNet outperforms previously proposed methods on both indoor and outdoor datasets.
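For readers unfamiliar with the pipeline, the sketch below illustrates the four steps named above using classical stand-ins (OpenCV SIFT for extraction and description, a ratio-test matcher for the initial correspondence set, and RANSAC on the fundamental matrix for outlier removal). It also computes simple displacement vectors between matched keypoints as one plausible form of the directional cue mentioned in the abstract. All of these choices are illustrative assumptions; this is not the CDNet implementation, which targets the final pruning step with a learned network.

```python
# Minimal sketch of the classical four-step matching pipeline:
# (1) feature extraction, (2) feature description,
# (3) initial correspondence set, (4) outlier removal.
# Illustrative only; CDNet replaces step (4) with learned pruning.
import cv2
import numpy as np

def match_two_views(path1, path2, ratio=0.75):
    img1 = cv2.imread(path1, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(path2, cv2.IMREAD_GRAYSCALE)

    # Steps 1-2: detect keypoints and compute SIFT descriptors.
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Step 3: build the initial (putative) correspondence set via a ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des1, des2, k=2)
    good = [m[0] for m in knn
            if len(m) == 2 and m[0].distance < ratio * m[1].distance]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    # Directional cue (assumption): the displacement vector of each putative
    # correspondence, one simple form of "directional information" that a
    # learned pruner could consume as an extra feature.
    directions = pts2 - pts1
    angles = np.arctan2(directions[:, 1], directions[:, 0])

    # Step 4: remove false correspondences. Here, classical RANSAC on the
    # fundamental matrix stands in for the learned pruning of CDNet.
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
    inliers = mask.ravel().astype(bool)
    return pts1[inliers], pts2[inliers], angles[inliers], F
```

A learned pruner such as CDNet would instead take the putative correspondences (together with cues such as these displacement vectors) as input and predict which of them are inliers before recovering the relative camera pose.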