Journal of Computer Applications (计算机应用), 2006

Implementation of Web information extraction system based on similar pages
基于相似页面的Web信息抽取系统的实现

GONG Zheng-xian, ZHU Qiao-ming, LI Pei-feng
贡正仙,朱巧明,李培峰

Keywords: RoadRunner, Web page, similar pages, information extraction

Abstract:

This paper analyzes the core algorithm of RoadRunner and identifies its deficiencies. On that basis, a Web information extraction system based on similar pages was designed and implemented. The paper introduces the system architecture and then presents the key techniques: obtaining similar Web pages, reliably handling noisy blocks in Web pages, and automatically deducing rules for extracting data items.
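
To make the similar-page idea concrete, here is a minimal sketch. It is not the authors' system and not the RoadRunner algorithm itself: it assumes two pages generated from the same template, aligns their token streams with Python's standard `difflib.SequenceMatcher` (a swapped-in alignment technique), and treats text that differs at aligned positions as candidate data items. The names `tokenize` and `extract_fields` and the sample pages are hypothetical.

```python
import re
from difflib import SequenceMatcher

# A tag such as <b> or </b>, or a run of text between tags.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")


def tokenize(html):
    """Split HTML into tag tokens and text tokens, dropping whitespace-only text."""
    tokens = []
    for match in TOKEN_RE.finditer(html):
        token = match.group().strip()
        if token:
            tokens.append(token)
    return tokens


def extract_fields(page_a, page_b):
    """Return (text_a, text_b) pairs where two similar pages disagree.

    Assumption: tokens shared by both pages belong to the template, and
    text tokens that mismatch at aligned positions are candidate data items.
    """
    tokens_a, tokens_b = tokenize(page_a), tokenize(page_b)
    matcher = SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)
    fields = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "replace":
            continue  # 'equal' spans are template; inserts/deletes are ignored here
        texts_a = [t for t in tokens_a[i1:i2] if not t.startswith("<")]
        texts_b = [t for t in tokens_b[j1:j2] if not t.startswith("<")]
        fields.extend(zip(texts_a, texts_b))
    return fields


# Hypothetical sample pages that share one template.
PAGE_A = "<html><body><h1>Book A</h1><p>Price: </p><b>10</b></body></html>"
PAGE_B = "<html><body><h1>Book B</h1><p>Price: </p><b>25</b></body></html>"

if __name__ == "__main__":
    print(extract_fields(PAGE_A, PAGE_B))
    # prints [('Book A', 'Book B'), ('10', '25')]
```

The sketch ignores noisy blocks and optional or repeated structures, which the abstract lists among the problems a practical system has to handle.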
