
Cross-Language Pseudo-Relevance Feedback with Bilingual Topics

DOI: 10.13190/jbupt.201304.81.wangxw, PP. 81-84

Keywords: pseudo-relevance feedback, latent Dirichlet allocation, bilingual topics, cross-language information retrieval, query expansion


Abstract:

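For the cross-language information retrieval task, this paper proposes a cross-language pseudo-relevance feedback model that incorporates bilingual topics. The latent Dirichlet allocation model is extended into a topic model that can model bilingual documents jointly, in which each topic can generate both source-language and target-language terms. Bilingual topics relevant to the query are selected, and the related terms within them are used to refine and expand the query translation, yielding a new query for a second retrieval pass. Experimental results show that cross-language retrieval based on this feedback model outperforms other feedback strategies, including those based on monolingual topic models and the vector space model.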
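
The abstract gives no formulas, so the following is only a minimal sketch of how topic selection and query expansion could fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the toy topic distributions, the likelihood-based topic scoring, the linear interpolation weight `lam`, and the names `TOPICS`, `score_topics`, and `expand_query`.

```python
from collections import defaultdict

# Toy bilingual topics (hypothetical numbers, not taken from the paper).
# Each topic holds one word distribution per language, mirroring the idea
# that a single topic can generate both source- and target-language terms.
TOPICS = [
    {"src": {"银行": 0.5, "贷款": 0.3, "利率": 0.2},
     "tgt": {"bank": 0.5, "loan": 0.3, "interest": 0.2}},
    {"src": {"河流": 0.6, "洪水": 0.4},
     "tgt": {"river": 0.6, "flood": 0.4}},
]


def score_topics(query_terms, topics, eps=1e-6):
    """Score each topic by the smoothed likelihood of the source-language query.

    Assumption: query likelihood under the topic's source-side distribution
    stands in for whatever relevance criterion the paper actually uses.
    """
    scores = []
    for topic in topics:
        p = 1.0
        for term in query_terms:
            p *= topic["src"].get(term, eps)  # eps smooths unseen terms
        scores.append(p)
    total = sum(scores)
    return [s / total for s in scores]


def expand_query(query_terms, translated, topics, n_topics=1, n_terms=3, lam=0.7):
    """Interpolate the translated query with target-language topic terms."""
    weights = score_topics(query_terms, topics)
    # Keep the n_topics highest-scoring topics.
    best = sorted(range(len(topics)), key=lambda k: weights[k], reverse=True)[:n_topics]
    norm = sum(weights[k] for k in best)

    # Mix the target-side distributions of the selected topics.
    feedback = defaultdict(float)
    for k in best:
        for term, p in topics[k]["tgt"].items():
            feedback[term] += weights[k] / norm * p

    # Keep the strongest feedback terms and renormalise them.
    top = sorted(feedback.items(), key=lambda kv: kv[1], reverse=True)[:n_terms]
    fb_norm = sum(p for _, p in top)

    # New query = lam * translated query + (1 - lam) * feedback terms.
    new_query = defaultdict(float)
    for term, w in translated.items():
        new_query[term] += lam * w
    for term, p in top:
        new_query[term] += (1 - lam) * p / fb_norm
    return dict(new_query)


# Source-language query and its (assumed) weighted translation.
expanded = expand_query(["银行", "贷款"], {"bank": 0.5, "loan": 0.5}, TOPICS)
```

In this toy run the finance topic dominates the score, so the expanded query keeps `bank` and `loan` and gains `interest`, a term absent from the original translation. The weighted terms in `expanded` would then be issued as the query for the second retrieval pass.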

