Journal of Software (软件学报), 2000

Extracting Semi-Structured Information from the WEB
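(Original Chinese title: 从WEB文档中构造半结构化信息的抽取器, "Constructing an Extractor of Semi-Structured Information from WEB Documents")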

Keywords: heuristic rules, data extraction format, object exchange model (OEM)


Abstract:

To integrate and query irregular, dynamic information on the WEB in a database-like fashion, the authors use the object exchange model (OEM) to build an information model of the WEB. To express each component of a page as an OEM object, they design an algorithm that extracts semi-structured data from HTML pages, and they report test results. The method can extract both structured and semi-structured data, and the authors state that it is more widely applicable than other existing methods.
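
The abstract does not give the algorithm or its heuristic rules, so the following is only a minimal sketch of the general idea: mapping the components of an HTML page to OEM-style objects, each with an object identifier, a label, and either an atomic value or a set of sub-objects. The names `OEMObject`, `OEMBuilder`, and `extract` are hypothetical, and using the tag name as the label is an assumption made for illustration, not the authors' method.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from itertools import count
from typing import List, Union


@dataclass
class OEMObject:
    """OEM-style object: identifier, label, and an atomic or complex value."""
    oid: int
    label: str
    # A str is an atomic value; a list holds sub-objects (a complex object).
    value: Union[str, List["OEMObject"]] = field(default_factory=list)


class OEMBuilder(HTMLParser):
    """Hypothetical builder: one complex object per element, one atomic per text run."""

    # Elements that never have a closing tag; skipped here to keep the sketch short.
    VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self) -> None:
        super().__init__()
        self._ids = count(1)
        self.root = OEMObject(next(self._ids), "page")
        self._stack = [self.root]  # path of currently open complex objects

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        node = OEMObject(next(self._ids), tag)  # assumption: label = tag name
        self._stack[-1].value.append(node)
        self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; this also closes any
        # unclosed elements nested inside it, which tolerates sloppy HTML.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i].label == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            atom = OEMObject(next(self._ids), "text", text)
            self._stack[-1].value.append(atom)


def extract(html: str) -> OEMObject:
    """Parse an HTML string and return the root OEM-style object."""
    builder = OEMBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

Calling `extract("<ul><li>A</li><li>B</li></ul>")` returns a root object labeled `page` whose single sub-object is labeled `ul` and holds two `li` objects, each wrapping one atomic `text` object. A real extractor would add heuristic rules on top of this tree, for example to assign meaningful labels instead of raw tag names.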
