Design and Implementation of an Intelligent Interaction System Based on Face Recognition and Speech Recognition Technology
Abstract:
This paper presents a multimodal intelligent interaction system that combines face recognition and speech recognition. The system consists of two main parts, a face recognition module and a speech recognition module, and is built by integrating an OpenMV camera, a microphone array, and the OpenMV IDE software environment. It performs feature point extraction and detection and combines these functions with speech enhancement, speech recognition, and face recognition. The OpenMV camera captures images, and a feature point detection algorithm run in the OpenMV IDE captures the user's facial features to verify identity and retrieve user information. In parallel, the microphone array captures the sound signal. The speech enhancement module applies a lightweight speech enhancement algorithm based on the temporal-frequential convolutional network (TFCN) to suppress background noise while keeping distortion of the target speech as low as possible. The speech recognition module then converts speech to text, making the system more intelligent. The system is well suited to smart home applications, in particular smart door locks: it can automatically recognize the faces of family members for keyless entry, and the speech recognition module can recognize specific voice commands such as "open the door" or "close the door", further improving the convenience and security of the lock. Experimental results show that fusing face recognition and speech recognition yields a working multimodal intelligent interaction system. The integrated design reflects the system's efficiency and stability and points to its potential and practical value in future applications.
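To make the smart door lock scenario concrete, the following is a minimal sketch of how a face recognition result and a recognized voice command might be combined into a lock action. It is not the implementation described in the paper: the names `FaceResult`, `LockAction`, `COMMANDS`, and `decide`, the `min_matches` threshold, and the rule that locking needs no identity check are all assumptions made for illustration. Only the two command phrases come from the abstract.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LockAction(Enum):
    UNLOCK = "unlock"
    LOCK = "lock"
    NONE = "none"


# Hypothetical command table; the two phrases are the examples in the abstract.
COMMANDS = {
    "open the door": LockAction.UNLOCK,
    "close the door": LockAction.LOCK,
}


@dataclass
class FaceResult:
    """Assumed output of the face recognition module."""
    user: Optional[str]  # matched household member, or None if unknown
    match_count: int     # number of matched facial feature points


def decide(face: FaceResult, transcript: str, min_matches: int = 10) -> LockAction:
    """Fuse the two modalities into one lock action.

    Assumption: unlocking requires both a recognized face with enough
    matched feature points and the spoken command, while locking only
    requires the spoken command.
    """
    action = COMMANDS.get(transcript.strip().lower(), LockAction.NONE)
    if action is LockAction.LOCK:
        return action
    if (
        action is LockAction.UNLOCK
        and face.user is not None
        and face.match_count >= min_matches
    ):
        return action
    return LockAction.NONE
```

In this sketch, a recognized family member saying "open the door" yields `LockAction.UNLOCK`, whereas the same phrase from an unrecognized face yields `LockAction.NONE`. Requiring both modalities for unlocking is one possible way to reflect the abstract's claim that voice commands add security as well as convenience.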