Real-Time Safety Monitoring of the Elderly through Lightweight Human Action Recognition
Abstract:
This paper explores how lightweight human action recognition can support real-time safety monitoring of the elderly. Elderly safety is a long-standing concern, particularly in modern society, where many older people live alone and lack timely care. To monitor their safety in a timely manner, the paper proposes a lightweight human action recognition algorithm designed for a real-time elderly monitoring system. The algorithm combines a convolutional neural network (CNN) with a long short-term memory (LSTM) recurrent neural network, pairing the CNN's strength in image feature extraction with the LSTM's suitability for processing temporal data. Detailed experiments were carried out to validate the approach, and the results show that the proposed algorithm has significant advantages.
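The data flow of a CNN combined with an LSTM can be illustrated with a small forward pass. The sketch below is not the paper's network: the abstract gives no layer sizes, so the single 3x3 convolution layer, the hidden size, the five action classes, the random weights, and all function and parameter names (`conv_features`, `lstm_step`, `classify_clip`, `init_params`) are assumptions chosen only to show how per-frame image features feed a recurrent layer that summarizes the clip.

```python
import numpy as np

def conv_features(frame, kernels):
    """Per-frame CNN stand-in: valid 3x3 convolution, ReLU, global average pooling."""
    # windows has shape (H-2, W-2, 3, 3): every 3x3 patch of the frame
    windows = np.lib.stride_tricks.sliding_window_view(frame, (3, 3))
    # Correlate every patch with every kernel -> (H-2, W-2, K)
    maps = np.einsum("hwij,kij->hwk", windows, kernels)
    return np.maximum(maps, 0.0).mean(axis=(0, 1))  # feature vector of length K

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; W, U, b stack the four gates as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    i, f, g, o = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update the cell state
    h = sigmoid(o) * np.tanh(c)                   # expose the gated hidden state
    return h, c

def classify_clip(clip, params):
    """clip: (T, H, W) grayscale frames -> probabilities over action classes."""
    hidden = params["U"].shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for frame in clip:
        x = conv_features(frame, params["kernels"])  # spatial features per frame
        h, c = lstm_step(x, h, c, params["W"], params["U"], params["b"])
    logits = params["V"] @ h + params["d"]           # classify from the last state
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def init_params(rng, n_kernels=8, hidden=16, n_classes=5):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    scale = 0.1
    return {
        "kernels": rng.normal(0, scale, (n_kernels, 3, 3)),
        "W": rng.normal(0, scale, (4 * hidden, n_kernels)),
        "U": rng.normal(0, scale, (4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "V": rng.normal(0, scale, (n_classes, hidden)),
        "d": np.zeros(n_classes),
    }

rng = np.random.default_rng(0)
params = init_params(rng)
clip = rng.random((12, 32, 32))  # synthetic clip: 12 frames of 32x32 pixels
probs = classify_clip(clip, params)
```

Because the weights are untrained, `probs` carries no meaning beyond being a valid probability vector. In a real system the convolutional and recurrent weights would be learned jointly from labeled video clips.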