A Visual Speech Synthesis System Based on a Dual Learning Model

Keywords: genetic algorithm, hidden Markov model, speech synthesis, feature extraction, speech processing, speech recognition

Abstract:

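To obtain more realistic mouth animation in visual speech synthesis, this paper proposes a synthesis method based on a dual learning model. Combining a hidden Markov model with a genetic algorithm lets the model better learn the mapping between acoustic speech features and visual features. The model removes the redundant information that traditional speech recognition approaches introduce when extracting speech features from a large-sample speech space, and so predicts visual speech more accurately. In addition, mouth shape is represented by geometric features based on facial animation parameter (FAP) feature points. This representation is robust to training samples captured under inconsistent lighting and better characterizes changes in the mouth shape itself. Compared with traditional principal component analysis (PCA) features, it also has a lower vector dimension, which speeds up both training and synthesis.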
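
The abstract does not give the exact geometric features used, so the following is only a minimal sketch of the general idea: a mouth-shape vector computed from feature-point coordinates rather than from pixel intensities. The point labels (`mouth_left`, `upper_lip_top`, and so on), the choice of three distances, and the inter-eye normalization are all assumptions made for illustration, not the paper's definitions.

```python
import math


def lip_geometry_features(points):
    """Return a small geometric feature vector from 2D face feature points.

    `points` maps hypothetical labels to (x, y) coordinates. The labels and
    the three distances chosen here are illustrative assumptions.
    """
    def dist(a, b):
        (xa, ya), (xb, yb) = points[a], points[b]
        return math.hypot(xa - xb, ya - yb)

    # Normalize by a face-scale reference so the features do not depend on
    # image resolution or on the speaker's distance from the camera.
    scale = dist("left_eye", "right_eye")
    if scale == 0:
        raise ValueError("reference distance must be non-zero")

    return [
        dist("mouth_left", "mouth_right") / scale,            # mouth width
        dist("upper_lip_top", "lower_lip_bottom") / scale,    # outer lip height
        dist("upper_lip_inner", "lower_lip_inner") / scale,   # inner opening
    ]


# Example frame with made-up coordinates.
frame = {
    "left_eye": (0.0, 0.0),
    "right_eye": (100.0, 0.0),
    "mouth_left": (20.0, 50.0),
    "mouth_right": (80.0, 50.0),
    "upper_lip_top": (50.0, 40.0),
    "lower_lip_bottom": (50.0, 70.0),
    "upper_lip_inner": (50.0, 50.0),
    "lower_lip_inner": (50.0, 60.0),
}
features = lip_geometry_features(frame)
```

Because the vector depends only on point positions, a change in lighting that leaves the tracked points in place leaves the features unchanged, and the vector has only a handful of dimensions. Both properties match what the abstract claims for its representation, although the actual feature set may differ.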
