- 2016
Web Data Source Selection for Tourism Humanities Information Integration
Abstract: Integrating humanities information is important for enriching the cultural connotation of a scenic spot. To improve the utility and efficiency of the integrated data, we propose a data source selection strategy for humanities information integration. First, a humanities information summary is built from celebrities, humanities themes, information length, and marker words. Second, a person-expansion strategy enriches the content of the summary. Finally, data sources are selected according to the incremental humanities information they contribute about celebrities. Experiments on tourism-domain data sets show that the proposed method achieves high accuracy.
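The abstract does not give the selection procedure in detail, so the following is only a minimal sketch of one way a selection step driven by information increment could look. It assumes each source summary maps a celebrity to a set of humanities themes, and it measures a source's increment as the number of (celebrity, theme) pairs it adds beyond those already covered. The summary format, the `pairs` and `select_sources` names, the greedy loop, and the toy data are all assumptions for illustration, not the authors' method.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical summary format: celebrity -> set of humanities themes.
Summary = Dict[str, Set[str]]


def pairs(summary: Summary) -> Set[Tuple[str, str]]:
    """Flatten a summary into (celebrity, theme) pairs."""
    return {(person, theme) for person, themes in summary.items() for theme in themes}


def select_sources(summaries: Dict[str, Summary], k: int) -> List[str]:
    """Greedily pick up to k sources, each adding the most uncovered pairs."""
    covered: Set[Tuple[str, str]] = set()
    chosen: List[str] = []
    remaining = dict(summaries)
    while remaining and len(chosen) < k:
        # Increment = pairs this source adds beyond what is already covered.
        # Iterating over sorted names makes ties resolve alphabetically.
        best = max(sorted(remaining), key=lambda name: len(pairs(remaining[name]) - covered))
        if not pairs(remaining[best]) - covered:
            break  # No remaining source contributes anything new.
        chosen.append(best)
        covered |= pairs(remaining.pop(best))
    return chosen


# Toy data, invented for this example.
toy_summaries = {
    "source_a": {"person_1": {"poetry", "travel"}},
    "source_b": {"person_1": {"poetry"}, "person_2": {"calligraphy"}},
    "source_c": {"person_1": {"poetry"}},
}
selected = select_sources(toy_summaries, 3)
```

In this toy run `source_c` is never chosen, because everything it offers is already covered by the time it is considered. Stopping at zero increment is the design choice that keeps redundant sources out of the integration.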