Predicting Categories and Ingredients of Traditional Dishes Using Deep Learning and Cross-Attention Mechanism

doi:10.4236/oalib.1112846

OALib Journal
ISSN: 2333-9721

Open Access Library Journal, Vol. 12, 2025

DOI: 10.4236/oalib.1112846, PP. 1-12

Ima Sokolo, Chidiebere Ugwu, Friday E. Onuodu

Subject Areas: Complex network models

Keywords: Traditional Dishes, Cross-Attention Mechanism, Convolutional Neural Network

Abstract

Food recognition systems are available for foreign dishes, but little work has been done on our traditional dishes, which makes it difficult to classify these dishes and identify the ingredients they are made of. An extensive literature review showed that existing food recognition models are not robust enough to handle the classification of traditional dishes and the identification of their ingredients. This study developed an improved food recognition system for that purpose. The food image dataset used to build the model was obtained from Kaggle and was not standardized. It was preprocessed and standardized for consistency into eighteen classes for model building. The standardized dataset was split into 80% for training and 20% for testing, and a convolutional neural network with a cross-attention mechanism was used to build the model. The cross-attention mechanism was used to selectively pick features across the multiple classes in the food dataset. ReLU was used as the activation function and Adam as the optimizer. Object-oriented analysis methodology was used in the design, and the Python programming language was used to develop the system. The results show an accuracy of 93.57% for training and 90.0% for validation, with error losses of 0.062% and 0.001% respectively; during testing, the model reached 99% accuracy on the traditional food images supplied to it. The application detected and classified traditional dishes into their classes and listed the ingredients used to prepare them, which indicates very strong performance of the system.
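
As a rough illustration of two steps named in the abstract, the sketch below shows an 80/20 split and a scaled dot-product cross-attention step using only the Python standard library. It is not the authors' implementation. The function names (`split_dataset`, `cross_attention`), the fixed shuffle seed, and the choice of which vectors act as queries and which as keys and values are all assumptions made for illustration, because the abstract does not specify them. The CNN, ReLU, and Adam components are not covered here.

```python
import math
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists.

    The seed is a hypothetical choice so the split is reproducible.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    `queries` come from one source (assumed here: image features);
    `keys` and `values` come from a second source (assumed here:
    class or ingredient embeddings). Returns (outputs, weights).
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        w = softmax(scores)
        # Weighted sum of the value vectors, one output column at a time.
        out = [
            sum(wj * v[col] for wj, v in zip(w, values))
            for col in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Usage: split 100 sample indices, then attend one query over two keys.
train_ids, test_ids = split_dataset(range(100))
attended, attn_weights = cross_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

Each row of `attn_weights` sums to 1, so every output vector is a convex combination of the value vectors. That weighting is one way to read the abstract's phrase "selectively pick features".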

Cite this paper

Sokolo, I., Ugwu, C. and Onuodu, F. E. (2025). Predicting Categories and Ingredients of Traditional Dishes Using Deep Learning and Cross-Attention Mechanism. Open Access Library Journal, 12, e2846. doi: http://dx.doi.org/10.4236/oalib.1112846.
