Food recognition systems exist for foreign dishes, but little work has been done on our traditional dishes, which makes it difficult to classify these dishes and identify the ingredients they are made of. An extensive literature review indicated that existing food recognition models are not robust enough to classify traditional dishes and identify their ingredients. This study developed an improved food recognition system for classifying traditional dishes and identifying their ingredients. The food image dataset used to build the model was obtained from Kaggle and was not standardized. It was preprocessed and standardized for consistency across eighteen different classes for model building. The standardized dataset was split into two parts: 80% for training and 20% for testing. A convolutional neural network and a cross-attention mechanism were used to build the model. The cross-attention mechanism was used to selectively pick features across the multiple classes in the food dataset. ReLU was used as the activation function and Adam as the optimizer. Object-oriented analysis methodology was used for the design, and the system was developed in the Python programming language. The results show an accuracy of 93.57% for training and 90.0% for validation, with error losses of 0.062% and 0.001% respectively; during testing, the model gave 99% accuracy on the traditional food images supplied to it. The application was able to detect and classify traditional dishes into different classes and outline the ingredients used to prepare them, which indicates strong performance of the system.
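The abstract does not specify how the cross-attention mechanism is wired into the network, so the following is only a minimal sketch of one common form, scaled dot-product cross-attention, in plain Python. It assumes that one query vector per dish class attends over feature vectors from a CNN feature map; the names `class_queries` and `cnn_features`, the 7x7 feature map, the feature dimension of 8, and the random values are all hypothetical and are not taken from the paper. Learned projection matrices are omitted for brevity.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: list of vectors from one source (assumed here: one per dish class).
    keys, values: lists of vectors from another source (assumed here: CNN
    feature-map locations).
    Returns (outputs, weights), where weights[i][j] is how strongly
    query i attends to location j.
    """
    scale = math.sqrt(len(keys[0]))
    out_dim = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        attn = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(a * v[d] for a, v in zip(attn, values)) for d in range(out_dim)]
        outputs.append(out)
        weights.append(attn)
    return outputs, weights


rng = random.Random(0)
NUM_CLASSES = 18    # number of classes stated in the abstract
NUM_LOCATIONS = 49  # assumption: a 7x7 CNN feature map
FEATURE_DIM = 8     # assumption: kept tiny for illustration

# Hypothetical stand-ins for learned class queries and CNN features.
class_queries = [[rng.gauss(0, 1) for _ in range(FEATURE_DIM)]
                 for _ in range(NUM_CLASSES)]
cnn_features = [[rng.gauss(0, 1) for _ in range(FEATURE_DIM)]
                for _ in range(NUM_LOCATIONS)]

attended, attn_weights = cross_attention(class_queries, cnn_features, cnn_features)
print(len(attended), len(attended[0]))
```

Each row of `attn_weights` sums to 1, so each class query produces a weighted average of the image features it finds most relevant. In a trained model the queries and features would come from learned layers rather than random numbers.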
Cite this paper
Sokolo, I., Ugwu, C. and Onuodu, F. E. (2025). Predicting Categories and Ingredients of Traditional Dishes Using Deep Learning and Cross-Attention Mechanism. Open Access Library Journal, 12, e2846. doi: http://dx.doi.org/10.4236/oalib.1112846.