A Classroom Student Facial Expression Recognition Model Based on Self-Attention Mechanism and Feature Fusion
Abstract:
To address the occlusion problem in recognizing students' facial expressions in typical classroom scenes, the original image is split into multiple face images through partial segmentation and random occlusion strategies. The same residual network extracts features from each branch, and a self-attention mechanism assigns a different weight to each branch. The loss function is then constrained so that the weight of the occluded branch always stays below the weight of the eye branch. The weighted branch features are finally combined through feature fusion to form a global feature. Experiments on the public dataset FERplus show that the model substantially improves the accuracy of facial expression recognition and effectively alleviates the information loss caused by occlusion in complex scenes.
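The branch weighting, weight constraint, and fusion steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the scaled dot-product scoring against a `query` vector, the softmax normalization, the hinge form of the constraint with its `margin`, the fusion by concatenation, and all toy feature values are assumptions chosen for illustration. In the paper the branch features would come from a shared residual network rather than hand-written lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def branch_weights(features, query):
    """Score each branch feature against a query vector (assumed learnable)
    using a scaled dot product, then normalize the scores with softmax."""
    scale = math.sqrt(len(query))
    scores = [sum(f * q for f, q in zip(feat, query)) / scale for feat in features]
    return softmax(scores)


def rank_constraint_loss(w_occluded, w_eye, margin=0.1):
    """Hinge penalty (assumed form): zero only when the eye-branch weight
    exceeds the occluded-branch weight by at least `margin`."""
    return max(0.0, w_occluded - w_eye + margin)


def fuse(features, weights):
    """Scale each branch feature by its weight and concatenate the results
    into a single global feature (concatenation is an assumption)."""
    fused = []
    for feat, w in zip(features, weights):
        fused.extend(w * x for x in feat)
    return fused


# Toy 4-dimensional features for three hypothetical branches.
branch_names = ["full_face", "eyes", "occluded"]
features = [
    [0.5, 1.0, -0.5, 0.2],  # full face
    [1.0, 0.8, 0.1, 0.4],   # eye region
    [0.1, -0.2, 0.0, 0.1],  # randomly occluded copy
]
query = [1.0, 1.0, 0.0, 0.5]

weights = branch_weights(features, query)
penalty = rank_constraint_loss(weights[2], weights[1])
global_feature = fuse(features, weights)

for name, w in zip(branch_names, weights):
    print(f"{name}: weight={w:.3f}")
print(f"rank constraint penalty: {penalty:.3f}")
print(f"global feature length: {len(global_feature)}")
```

In training, a penalty of this kind would be added to the classification loss, so that gradient updates push the occluded-branch weight below the eye-branch weight whenever the ordering is violated.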