
- 2017

THUYG-20: A free Uyghur speech database

DOI: 10.16511/j.cnki.qhdxxb.2017.22.012

艾斯卡尔·肉孜,殷实,张之勇,王东,艾斯卡尔·艾木都拉,郑方

Keywords: speech recognition, Uyghur language, corpus, deep neural network (DNN)


Abstract:

Speech data is the foundation of speech recognition research. However, few open speech databases are freely available to researchers in China, and resources are especially scarce for minority languages such as Uyghur. This paper releases a completely free and open Uyghur continuous speech database. The database consists of about 20 h of training speech and 1 h of test speech, together with the resources needed to build a full Uyghur speech recognition system, including a phone set, a lexicon, and text data, as well as the scripts used to build a baseline system. Results for the baseline system are reported on two test sets, one of clean speech and one of noisy speech. The database offers a standard reference corpus for Uyghur speech recognition research.
