- 2016
Voice activity detection method for complex noise scenarios
Abstract: This paper proposes a robust voice activity detection (VAD) method for complex noise conditions. Three acoustic parameters (energy, the dominant frequency component, and short-time spectral entropy) form a three-dimensional feature; the three parameters strongly complement one another across a wide range of noise types. When detecting active speech pulses, the K-means clustering algorithm adaptively selects features and computes the utterance-dependent thresholds used in the subsequent speech detection process. Experiments on the NIST (National Institute of Standards and Technology) Speaker Recognition Evaluation (SRE) 2008 and 2012 tasks show that the method performs well across different noise environments and is more robust and efficient than conventional unsupervised and supervised VAD algorithms.
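The abstract does not give the feature definitions or the clustering details, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes common textbook definitions of the three features and uses a one-dimensional K-means with K = 2 on the log-energy alone to derive an utterance-dependent threshold. The function names (`frame_features`, `two_means_threshold`, `energy_vad`) and the frame sizes are hypothetical choices made for illustration.

```python
import numpy as np


def frame_features(x, sr, frame_len=400, hop=160):
    """Return per-frame [log-energy, dominant frequency (Hz), spectral entropy].

    Assumed definitions (not taken from the paper): features are computed
    from the power spectrum of Hann-windowed frames.
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    feats = np.empty((n_frames, 3))
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        total = power.sum() + 1e-12
        p = power / total                             # normalized spectrum
        feats[i, 0] = np.log(total)                   # log-energy
        feats[i, 1] = freqs[np.argmax(power)]         # dominant frequency
        feats[i, 2] = -np.sum(p * np.log(p + 1e-12))  # spectral entropy
    return feats


def two_means_threshold(values, n_iter=50):
    """1-D K-means with K = 2; return the midpoint between the two centroids."""
    lo, hi = values.min(), values.max()
    for _ in range(n_iter):
        high = np.abs(values - hi) < np.abs(values - lo)
        if high.all() or not high.any():
            break  # degenerate split, keep the current centroids
        new_lo, new_hi = values[~high].mean(), values[high].mean()
        if new_lo == lo and new_hi == hi:
            break  # converged
        lo, hi = new_lo, new_hi
    return 0.5 * (lo + hi)


def energy_vad(x, sr):
    """Mark a frame as speech when its log-energy exceeds the clustered threshold."""
    feats = frame_features(x, sr)
    threshold = two_means_threshold(feats[:, 0])
    return feats[:, 0] > threshold
```

Because the threshold is recomputed from each utterance's own frames, it adapts to the recording's noise floor instead of relying on a fixed constant. The same clustering step could be applied to the dominant frequency or the spectral entropy column; how the paper selects among the three features is not stated in the abstract.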