A Text Label Generation Model Based on Lifelogging Videos

DOI: 10.12677/csa.2025.151008, PP. 71-84

Keywords: Lifelogging, Video Keyframe Selection, Video Retrieval, Video Tagging

Abstract:

In 2011, we launched a project to collect personal lifelogging data, which has since gathered 40,000 lifelogging entries from 22 volunteers. As the volunteers' data has grown over time, the collection has come to include 3020 videos, and searching those videos has become very difficult. We therefore propose Liu-VTM (Video Tags Model), a model that combines video decomposition with image analysis. The model selects keyframes that represent the content of a lifelogging video, applies image recognition to those keyframes to generate tags for the video, and then uses the tags to retrieve the corresponding video directly. In our experiments, we examine how several keyframe selection methods affect the model, and we propose a new evaluation metric, "Optimal Content Coverage Rate", for assessing the quality of keyframes selected from lifelogging videos. The experimental results show that Liu-VTM can effectively tag the videos in a lifelogging dataset and retrieve them by those tags.
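
The pipeline described in the abstract (keyframe selection, image recognition on the keyframes, tag-based retrieval) can be illustrated with a minimal Python sketch. This is not the Liu-VTM implementation: the abstract does not say which keyframe selection method or recognition model the authors use, so the sketch assumes a simple frame-difference threshold for keyframe selection and takes the recognizer as a caller-supplied function. All names here (`select_keyframes`, `tag_video`, `build_index`, `toy_recognize`) are hypothetical, and frames are represented as flat lists of grayscale values purely for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical stand-in for a decoded video frame: a flattened grayscale image.
Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: List[Frame], threshold: float) -> List[int]:
    """Assumed selection rule: keep a frame when it differs from the
    most recently kept frame by more than `threshold`."""
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if frame_distance(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


def tag_video(
    frames: List[Frame],
    recognize: Callable[[Frame], Set[str]],
    threshold: float,
) -> Set[str]:
    """Union of the labels the recognizer returns for each keyframe."""
    tags: Set[str] = set()
    for i in select_keyframes(frames, threshold):
        tags.update(recognize(frames[i]))
    return tags


def build_index(
    videos: Dict[str, List[Frame]],
    recognize: Callable[[Frame], Set[str]],
    threshold: float,
) -> Dict[str, Set[str]]:
    """Inverted index from tag to the ids of videos carrying that tag."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for video_id, frames in videos.items():
        for tag in tag_video(frames, recognize, threshold):
            index[tag].add(video_id)
    return dict(index)


def toy_recognize(frame: Frame) -> Set[str]:
    """Toy recognizer used only for this example; a real system would
    call an image recognition model here."""
    mean = sum(frame) / len(frame)
    return {"bright"} if mean > 0.5 else {"dark"}
```

With an index built this way, retrieving videos by tag is a dictionary lookup, for example `build_index(videos, toy_recognize, 0.3).get("bright", set())`. Swapping in a different keyframe selection rule only requires replacing `select_keyframes`, which is the part of the pipeline whose variants the abstract says were compared.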
