%0 Journal Article
%T DNN-LSTM based VAD algorithm
%A 张雪英
%A 牛溥华
%A 高帆
%J Journal of Tsinghua University (Science and Technology)
%D 2018
%R 10.16511/j.cnki.qhdxxb.2018.25.022
%X Voice activity detection (VAD) algorithms based on deep neural networks (DNN) ignore the temporal correlation of acoustic features across speech frames, which significantly degrades their performance in noisy environments. This paper proposes a hybrid network structure that combines a DNN with long short-term memory (LSTM) units for VAD, and further analyzes and exploits the dynamic information in the speech frames. A cost function based on context information is used together with the DNN-LSTM structure to train the network. The experimental corpus is based on the TIDIGITS speech database, with noise added from the Noisex-92 noise database. The results show that the DNN-LSTM based VAD method outperforms the DNN-based VAD method in different noise environments, and that the new cost function is better suited to the proposed algorithm than the traditional cost function.
%K voice activity detection (VAD)
%K deep neural network (DNN)
%K long short-term memory (LSTM)
%U http://jst.tsinghuajournals.com/CN/Y2018/V58/I5/509