Journal of Electronics & Information Technology (电子与信息学报), 2004
Dynamic Channel Compensation Based on Statistical Model for Mandarin Speech Recognition over Telephone
Abstract:
Automatic speech recognition in telecommunications environments still achieves lower accuracy than its desktop counterparts. Improving the performance of telephone-quality speech recognition is an urgent problem for its application in practical fields. Previous work has shown that the main reason for this performance degradation is the variational mismatch between the training and test sets caused by differing telephone channels. In this paper, the authors propose an efficient implementation that dynamically compensates for this mismatch, based on a phone-conditioned prior statistical model of the channel bias. The algorithm uses Bayes' rule to estimate the telephone channel and dynamically follows time variations within the channel. In their experiments on Mandarin Large Vocabulary Continuous Speech Recognition (LVCSR) over telephone lines, the average Character Error Rate (CER) decreases by more than 27% when the algorithm is applied; in the short-utterance test, the Word Error Rate (WER) shows a relative reduction of 30%. At the same time, the structural delay and computational cost of the algorithm are limited, with an average delay of about 200 ms, so it could be embedded into practical telephone-based applications.
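The abstract does not give the estimation formulas, so the following is only a minimal illustrative sketch of the general idea of Bayesian channel-bias tracking, not the authors' algorithm. It assumes that the channel acts as an additive bias on cepstral features, that each phone has diagonal-Gaussian clean-speech statistics, and that a forgetting factor lets the estimate follow slow channel drift. The names `PhoneStats`, `BiasTracker`, and `forgetting` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PhoneStats:
    """Hypothetical clean-speech statistics for one phone (diagonal Gaussian)."""
    mean: List[float]
    var: List[float]


class BiasTracker:
    """Recursive Gaussian (MAP) estimate of an additive channel bias.

    Assumption: observed frame = clean frame + bias, with the clean frame
    drawn from the Gaussian of the phone currently aligned to that frame.
    """

    def __init__(self, prior_mean: Sequence[float],
                 prior_var: Sequence[float], forgetting: float = 0.98):
        self.mean = list(prior_mean)               # current bias estimate
        self.prec = [1.0 / v for v in prior_var]   # precision = 1 / variance
        self.forgetting = forgetting               # < 1 discounts old evidence

    def update(self, frame: Sequence[float], phone: PhoneStats) -> List[float]:
        """Apply Bayes' rule for one frame and return the compensated frame."""
        for d, y in enumerate(frame):
            # Discount accumulated evidence so the estimate can follow drift.
            prior_prec = self.forgetting * self.prec[d]
            obs_prec = 1.0 / phone.var[d]
            post_prec = prior_prec + obs_prec
            # Precision-weighted mean of the old estimate and the new evidence.
            evidence = y - phone.mean[d]
            self.mean[d] = (prior_prec * self.mean[d] + obs_prec * evidence) / post_prec
            self.prec[d] = post_prec
        # Subtract the estimated bias to compensate the frame.
        return [y - b for y, b in zip(frame, self.mean)]
```

In use, each incoming frame would be passed to `update` together with the statistics of the phone the decoder currently hypothesizes for it. Because the update is frame-by-frame, a scheme of this kind adds little structural delay, which is in the spirit of the low-latency claim above, though the 200 ms figure belongs to the authors' method and not to this sketch.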