-  2015 

Search result diversification via data fusion

DOI: 10.6040/j.issn.1671-9352.3.2014.033

Keywords: weight assignment, search result diversification, data fusion, linear combination


Abstract: Information retrieval systems need to consider not only the relevance of retrieved documents but also their diversity and novelty. This paper examines whether data fusion methods are applicable to search result diversification. For the linear combination method in particular, the weight assignment strategy for component systems is reexamined. Taking into account both the effectiveness of each component retrieval system and the dissimilarity among component systems, a simple and convenient weighting method based on set coverage rate is proposed, so that linear combination with these weights yields more diverse results. Experiments were carried out with 3 groups of top-ranked results submitted to the TREC web diversity task. The results show that, in terms of diversity, the proposed data fusion methods all improve retrieval performance and outperform the best component retrieval system; data fusion thus remains a useful approach for improving diversity, as it has previously been for relevance.
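The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual set-coverage formula, so everything below is an assumption made for illustration: dissimilarity is approximated by one minus the average top-k overlap with the other runs, each weight is taken as effectiveness multiplied by dissimilarity, and scores are min-max normalized before being combined. All function names and the sample data are hypothetical.

```python
from collections import defaultdict


def ranking(scores):
    """Return document ids sorted by descending score (ties broken by id)."""
    return sorted(scores, key=lambda d: (-scores[d], d))


def overlap_rate(run_a, run_b, k):
    """Fraction of the top-k documents of run_a that also occur in the top-k of run_b."""
    top_a, top_b = run_a[:k], set(run_b[:k])
    if not top_a:
        return 0.0
    return sum(1 for d in top_a if d in top_b) / len(top_a)


def dissimilarity(name, rankings, k):
    """Assumed stand-in for the paper's measure: 1 - mean top-k overlap with the other runs."""
    others = [r for n, r in rankings.items() if n != name]
    if not others:
        return 1.0
    return sum(1.0 - overlap_rate(rankings[name], r, k) for r in others) / len(others)


def min_max_normalize(scores):
    """Map raw scores of one run into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def linear_combination(scored_runs, weights):
    """Fuse runs: each document's fused score is the weighted sum of its normalized scores."""
    fused = defaultdict(float)
    for name, scores in scored_runs.items():
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += weights[name] * s
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical component runs (document id -> retrieval score).
scored_runs = {
    "A": {"d1": 3.0, "d2": 2.0, "d3": 1.0},
    "B": {"d1": 0.9, "d2": 0.8, "d4": 0.7},
    "C": {"d5": 10.0, "d6": 5.0, "d1": 1.0},
}
# Hypothetical per-run effectiveness, e.g. a diversity metric measured on training queries.
effectiveness = {"A": 0.5, "B": 0.4, "C": 0.3}

rankings = {name: ranking(scores) for name, scores in scored_runs.items()}
weights = {
    name: effectiveness[name] * dissimilarity(name, rankings, k=3)
    for name in scored_runs
}
fused_ranking = linear_combination(scored_runs, weights)
print(weights)
print(fused_ranking)
```

In this sketch, run C shares the fewest top-ranked documents with the others, so its dissimilarity factor is the largest and partly offsets its lower effectiveness. How the paper actually balances the two factors is not stated in the abstract.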

