
A Lattice-Space-Based Data Extraction Algorithm for Result-Limited Deep Web Sources

pp. 130-137

Keywords: data extraction, tolerance relation, formal concept analysis, concept lattice


Abstract:

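This paper considers Deep Web data sources that limit the number of results returned per query, and transforms the problem of predicting query result sizes and extracting the data into a concept covering problem. It first proves that the set partitions generated by attributes and attribute combinations are related by a tolerance relation, and then proves that these partitions form a complete lattice that is homomorphic to the concept lattice. The partial order among concepts is used to characterize the correlation between attributes; the intent of a concept serves as the query attributes, and its extent serves as the prediction of the returned results. The concept lattice, pruned by the cardinality of the extents, serves as the search space. On this basis, a lattice-space-based Deep Web data extraction algorithm is proposed. The experiments comprise controlled experiments and real-world application experiments, and the results demonstrate the theoretical correctness of the algorithm as well as its feasibility and effectiveness in practice.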
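The idea of using concept intents as queries and concept extents as predicted results can be illustrated with a small Python sketch. This is not the paper's algorithm: the toy context `CONTEXT`, the function names, the naive concept enumeration, and the greedy covering step are all assumptions made for illustration. The sketch enumerates the formal concepts of a tiny record/attribute table, prunes the concepts whose extent exceeds a hypothetical return limit, and greedily picks concepts until every record is covered.

```python
from itertools import combinations

# Hypothetical toy context: record id -> set of attribute values.
CONTEXT = {
    "r1": {"db", "2008"},
    "r2": {"db", "2009"},
    "r3": {"web", "2008"},
    "r4": {"web", "2009"},
    "r5": {"db", "web", "2009"},
}


def extent(attrs, context):
    """Records that carry every attribute in attrs."""
    return frozenset(r for r, a in context.items() if attrs <= a)


def intent(records, context):
    """Attributes shared by every record in records."""
    if not records:
        # The empty extent is paired with the full attribute set.
        return frozenset(set().union(*context.values()))
    return frozenset(set.intersection(*(context[r] for r in records)))


def concepts(context):
    """Naive enumeration of all formal concepts as {extent: intent}.

    Exponential in the number of attributes; fine only for toy data.
    """
    attrs = sorted(set().union(*context.values()))
    found = {}
    for n in range(len(attrs) + 1):
        for combo in combinations(attrs, n):
            ext = extent(set(combo), context)
            found[ext] = intent(ext, context)
    return found


def plan_queries(context, limit):
    """Greedy cover using only concepts whose extent fits the limit.

    Returns (plan, uncovered): plan is a list of (query attributes,
    predicted records); uncovered holds records no candidate reaches.
    """
    # Prune by extent cardinality: such a query would be truncated.
    candidates = {e: i for e, i in concepts(context).items()
                  if 0 < len(e) <= limit}
    uncovered = set(context)
    plan = []
    while uncovered and candidates:
        best = max(candidates,
                   key=lambda e: (len(e & uncovered), tuple(sorted(e))))
        if not best & uncovered:
            break  # no remaining candidate adds new records
        plan.append((sorted(candidates[best]), sorted(best)))
        uncovered -= best
    return plan, uncovered


if __name__ == "__main__":
    plan, uncovered = plan_queries(CONTEXT, limit=2)
    for query, predicted in plan:
        print(query, "->", predicted)
    print("uncovered:", sorted(uncovered))
```

In this sketch the extent size acts as the predicted result size of a query, so any concept whose extent exceeds `limit` is dropped before the search. A real source would not expose the full context in advance, so the table here only stands in for whatever sample or prior knowledge an extraction method works from.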

