|
控制理论与应用 2011
Adaptive fusion of acoustic and visual information in noise-robust speech recognition
|
Abstract:
We propose the adaptive fusion of acoustic and visual information for improving the accuracy and the robustness in the speech recognition. The acoustic and visual information is involved in the recognition process with different weights, which are adaptively determined according to the signal-to-noise ratio(SNR) between the environment inputs during the process of recognition. Based on the SNR and the performance feedback, a learning automata is used for computing the adaptive weights for the visual information. A hidden Markov model is used to match the patterns of the acoustic information and the visual information. The hidden Markov model decides the final recognition results by combining the acoustic information and the visual information with optimal weights. Experiments under various noise-level conditions are performed; results show that the speech recognition based on adaptive weights surpasses the speech recognition based on fixed weights.