Implementation of a Web information extraction system based on similar pages

Keywords: RoadRunner, Web pages, similar pages, information extraction


Abstract:

The core algorithm of RoadRunner was analyzed and its deficiencies were identified. On that basis, a Web information extraction system based on similar pages was designed and implemented. The system architecture is introduced, followed by the key techniques: obtaining similar Web pages, reliably handling noisy Web blocks, and automatically deducing rules for extracting data items.
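
The idea of deducing extraction rules from similar pages can be illustrated with a minimal sketch. It is not RoadRunner's actual algorithm (which also infers optional and repeated structures), nor the system described in this paper. It only assumes that two pages generated from the same template share their markup and differ where the data items are. The function names `tokenize` and `align_pages` and the `#DATA` placeholder are hypothetical, chosen for illustration; the alignment uses Python's standard `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher

# A token is either a whole tag or a run of text between tags.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")


def tokenize(html):
    """Split HTML into tag tokens and non-empty text tokens."""
    return [t.strip() for t in TOKEN_RE.findall(html) if t.strip()]


def align_pages(page_a, page_b):
    """Align two similar pages (illustrative only).

    Tokens shared by both pages are treated as template; every span
    where the pages differ becomes one '#DATA' slot, and the differing
    tokens are collected as that page's extracted fields.
    """
    a, b = tokenize(page_a), tokenize(page_b)
    template, fields_a, fields_b = [], [], []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            template.extend(a[i1:i2])
        else:
            template.append("#DATA")
            fields_a.append(" ".join(a[i1:i2]))
            fields_b.append(" ".join(b[j1:j2]))
    return template, fields_a, fields_b


if __name__ == "__main__":
    page_a = "<html><b>Title:</b> Dune <i>Herbert</i></html>"
    page_b = "<html><b>Title:</b> Emma <i>Austen</i></html>"
    template, fields_a, fields_b = align_pages(page_a, page_b)
    print(template)
    print(fields_a, fields_b)
```

In this toy case the two text tokens that differ between the pages become data slots, while the shared tags and the label "Title:" remain in the template. Real pages would additionally need noisy blocks (navigation, advertisements) filtered out before alignment, which this sketch does not attempt.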
