TFIE-Gait: A Gait Recognition Model Based on Time-Frequency Information Enhancement
Abstract:
In gait recognition, spatial and temporal information are both crucial for distinguishing gait patterns. However, on open-environment datasets (e.g., GREW), existing methods rely primarily on spatial information and fail to fully exploit temporal information; moreover, noise in such datasets (e.g., occlusions and motion pauses) disrupts the temporal structure of gait sequences, interferes with temporal feature extraction, and degrades model performance. To address these issues, this paper proposes the TFIE-Gait model, which introduces a Time-Frequency Information Enhancement (TFIE) module and a Denoising and Sampling (DAS) module. The TFIE module integrates time-domain and frequency-domain information: it extracts per-joint temporal features and inter-joint dependencies with multi-scale convolution and self-attention, and extracts discriminative frequency-domain features via the Fourier transform. The DAS module identifies and removes abnormal frames by jointly analyzing the differences between sequences before and after frequency-domain denoising, and stitches the remaining subsequences with a cross-correlation algorithm to restore the periodic temporal structure of the gait sequence. Experimental results show that TFIE-Gait significantly outperforms baseline models on open-environment datasets.
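To make the mechanisms summarized above concrete, the following minimal NumPy sketch (not the authors' released code) illustrates the general ideas only: low-frequency Fourier descriptors of a skeleton sequence, abnormal-frame detection from the residual between a sequence and its frequency-domain-denoised version, and cross-correlation-based alignment before stitching subsequences. All function names, tensor shapes, and thresholds (`frequency_features`, `remove_abnormal_frames`, `stitch_subsequences`, `keep_ratio`, `z_thresh`) are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the ideas described in the abstract; names and thresholds
# are assumptions for illustration, not the TFIE-Gait implementation.
import numpy as np

def frequency_features(seq, k=8):
    """Keep the k lowest-frequency FFT magnitudes per joint coordinate.

    seq: (T, J, C) array of T frames, J joints, C coordinates.
    Returns a (J, C, k) array of frequency-domain descriptors.
    """
    spec = np.fft.rfft(seq, axis=0)            # FFT over the temporal axis
    return np.abs(spec[:k]).transpose(1, 2, 0)

def remove_abnormal_frames(seq, keep_ratio=0.1, z_thresh=3.0):
    """Low-pass the sequence in the frequency domain and drop frames whose
    residual (original minus denoised) is a statistical outlier."""
    T = seq.shape[0]
    spec = np.fft.rfft(seq, axis=0)
    cutoff = max(1, int(keep_ratio * spec.shape[0]))
    spec[cutoff:] = 0                           # crude low-pass denoising
    denoised = np.fft.irfft(spec, n=T, axis=0)
    residual = np.linalg.norm((seq - denoised).reshape(T, -1), axis=1)
    z = (residual - residual.mean()) / (residual.std() + 1e-8)
    keep = z < z_thresh
    return seq[keep], keep

def stitch_subsequences(a, b):
    """Align subsequence b to a at the lag that maximizes their
    cross-correlation, then append the non-overlapping tail of b."""
    sa = a.reshape(a.shape[0], -1).mean(axis=1)
    sb = b.reshape(b.shape[0], -1).mean(axis=1)
    sa, sb = sa - sa.mean(), sb - sb.mean()
    corr = np.correlate(sa, sb, mode="full")
    lag = corr.argmax() - (len(sb) - 1)         # shift of b relative to a
    overlap = max(0, len(sa) - max(lag, 0))
    return np.concatenate([a, b[overlap:]], axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0, 4 * np.pi, 120)
    seq = np.sin(t)[:, None, None] + 0.05 * rng.standard_normal((120, 17, 2))
    seq[40:43] += 5.0                           # inject occlusion-like noise
    clean, mask = remove_abnormal_frames(seq)
    feats = frequency_features(clean)
    merged = stitch_subsequences(clean[:60], clean[40:])
    print(clean.shape, feats.shape, merged.shape)
```

In the full model, such preprocessed sequences and frequency descriptors would presumably feed the multi-scale convolution and self-attention branches of the TFIE module; the sketch stops at the denoising, stitching, and frequency-descriptor level.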