%0 Journal Article
%T Mobile Fruit Recognition Based on Deep Learning
%A 郭健
%A 吴薇
%J Embedded Technology and Intelligent Systems
%P 64-76
%D 2024
%I Hans Publishing
%R 10.12677/etis.2024.12008
%X Fruit recognition in supermarkets still relies mainly on manual identification, and computer vision offers an alternative. Challenges remain, however: recognition accuracy is low for some fruits, models are difficult to deploy on terminal devices, and misidentified images are hard to handle. This paper therefore studies deep-learning-based fruit recognition on mobile devices, with the aim of replacing manual identification. First, the paper constructs DailyFruit-49, a supermarket fruit image dataset covering 49 types of fruit. To address fruits that are hard to recognize (fine-grained categories with highly similar features, fruits occluded by packaging, and fruits that are small and few in number), as well as model deployment on low-compute devices, a backbone model that meets the deployment requirements is selected. A new attention module, RMA, is designed, and the ViT Block is improved to strengthen the model's recognition of fine details and its integration of deep semantic features, yielding the DenseRMA_ViT model; the loss function is also improved on the basis of Focal Loss. Ablation experiments on the public Fruits-262 dataset verify the effectiveness of these improvements. Finally, a fruit recognition system is implemented on actual devices and meets practical usage needs. The system collects misidentified fruit images through user interactions and automatically fine-tunes the model weights on them. As the system stays in use and collects more images, the model's recognition accuracy and generalization ability improve, allowing it to handle fruits that are misidentified in real-world use.
%K Fruit Recognition
%K Dataset Construction
%K Improved Attention Mechanism
%K ViT
%K System Design
%K Model Weight Self-Updating
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=101680