A Football Video Clip Classification Method Based on CNN-BiGRU
Abstract:
Deep-learning-based video classification is an important direction in sports video research. To address the low recognition rate for video event types, this paper proposes a football video event classification method based on a CNN-BiGRU network. The method first uses the scene-change detection function of the PySceneDetect tool to split complete football videos into shots, and on that basis builds a dataset containing five classes of football events. Then, following experimental comparison, the mainstream convolutional neural network VGG16 is combined with a BiGRU to build the classification model. The experimental results show that combining a CNN with an RNN addresses the underuse of the temporal dimension in video, integrates the temporal and spatial dynamics of football video more effectively, and achieves higher accuracy and faster speed than traditional techniques. At present, the model's highest recognition rate for a single event class on the football video dataset is 97.4%.
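As a rough illustration of how a bidirectional GRU can aggregate per-frame CNN features over time, the following is a minimal pure-Python sketch, not the model used in the paper. Everything in it is an assumption made for illustration: the random vectors stand in for VGG16 frame features, the dimensions are tiny, the weights are random and untrained, the update-gate convention is one of two common ones, and classifying from the concatenated final forward and backward states is only one possible design.

```python
import math
import random

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def init_gru(in_dim, hid_dim, rng):
    """Random (untrained) weights for the update (z), reset (r) and candidate (h) gates."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {g: (mat(hid_dim, in_dim), mat(hid_dim, hid_dim), [0.0] * hid_dim)
            for g in "zrh"}

def gru_step(p, x, h):
    """One GRU update: h_t = (1 - z) * h_{t-1} + z * h_candidate."""
    def gate(name, h_in, act):
        W, U, b = p[name]
        pre = [a + c + d for a, c, d in zip(matvec(W, x), matvec(U, h_in), b)]
        return [act(v) for v in pre]
    z = gate("z", h, sigmoid)
    r = gate("r", h, sigmoid)
    h_cand = gate("h", [ri * hi for ri, hi in zip(r, h)], math.tanh)
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_cand)]

def run_gru(p, frames, hid_dim):
    """Run the GRU over a frame sequence and return the final hidden state."""
    h = [0.0] * hid_dim
    for x in frames:
        h = gru_step(p, x, h)
    return h

def bigru_classify(frames, fwd, bwd, W_out, hid_dim):
    """Concatenate forward and backward final states, then apply a softmax layer."""
    h_f = run_gru(fwd, frames, hid_dim)
    h_b = run_gru(bwd, frames[::-1], hid_dim)  # same clip, read in reverse
    logits = matvec(W_out, h_f + h_b)
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical sizes; real VGG16 features would be far larger.
FEAT_DIM, HID_DIM, NUM_CLASSES, NUM_FRAMES = 8, 4, 5, 6
rng = random.Random(0)
frames = [[rng.random() for _ in range(FEAT_DIM)] for _ in range(NUM_FRAMES)]
fwd = init_gru(FEAT_DIM, HID_DIM, rng)
bwd = init_gru(FEAT_DIM, HID_DIM, rng)
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(2 * HID_DIM)]
         for _ in range(NUM_CLASSES)]
probs = bigru_classify(frames, fwd, bwd, W_out, HID_DIM)
print(probs)  # five class probabilities that sum to 1
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows the data flow, in which each clip is read once forward and once backward so that the classifier sees temporal context from both directions.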