


Record-Level Information Extraction from a Web Page based on Visual Features


Abstract:

Web databases contain a huge amount of structured data that can be obtained only through their query interfaces. Query results are presented in dynamically generated web pages, usually in the form of data records, for human use. Automatically extracting data records from query result pages is a key problem for web data integration applications such as comparison shopping sites and meta-search engines. A number of approaches to query result extraction have been proposed, but as the structures of web pages become more complex, these approaches start to fail. Query result pages usually also contain other types of information in addition to query results, e.g., advertisements and navigation bars. Most of the existing approaches do not remove such irrelevant content, which may affect the accuracy of data record extraction. We have observed that query results are usually displayed in regular visual patterns and that terms used in a query often reappear in the query results.
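The observation that query terms often reappear in query results can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not describe. It assumes the page has already been split into candidate blocks of plain text, and the names tokenize, query_term_score, and rank_blocks, as well as the sample blocks, are hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_term_score(query, block_text):
    """Return the fraction of distinct query terms that reappear in a block."""
    terms = set(tokenize(query))
    if not terms:
        return 0.0
    block_terms = set(tokenize(block_text))
    return len(terms & block_terms) / len(terms)


def rank_blocks(query, blocks):
    """Sort (name, text) pairs by descending score; ties keep input order."""
    return sorted(
        blocks,
        key=lambda block: query_term_score(query, block[1]),
        reverse=True,
    )


# Hypothetical candidate blocks from a query result page (assumed input).
blocks = [
    ("nav", "Home Products About Contact"),
    ("ad", "Cheap flights to Paris, book now"),
    ("results", "Canon EOS digital camera 24MP. Nikon digital camera kit."),
]

ranked = rank_blocks("digital camera", blocks)
print(ranked[0][0])
```

Under these assumptions, the block holding the query results ranks above the navigation bar and the advertisement, because it is the only one in which the query terms reappear. A fuller approach would combine such a score with the visual regularity of the records, which this sketch does not model.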

