International Journal of Computer Technology and Electronics Engineering 2012

Record-Level Information Extraction from a Web Page based on Visual Features

A. Suresh Babu, Dr. P. Premchand and Dr. A. Govardhan

Abstract:

Web databases contain a huge amount of structured data that is accessible only through their query interfaces. Query results are presented in dynamically generated web pages, usually in the form of data records, for human use. A key problem for web data integration applications, such as comparison shopping sites and meta-search engines, is automatically extracting data records from query result pages. A number of approaches to query result extraction have been proposed. As the structures of web pages become more complex, these approaches start to fail. Query result pages usually also contain other types of information in addition to query results, e.g., advertisements and navigation bars. Most of the existing approaches do not remove such irrelevant content, which may affect the accuracy of data record extraction. We have observed that query results are usually displayed in regular visual patterns and that terms used in a query often reappear in query results.
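
The second observation, that query terms tend to reappear in the result records, can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes the page has already been split into candidate text blocks, and the names `term_overlap` and `rank_blocks` and the sample blocks are hypothetical. The visual-pattern observation is not modeled here.

```python
import re


def term_overlap(block_text, query_terms):
    """Fraction of query terms that occur as whole words in block_text."""
    words = set(re.findall(r"\w+", block_text.lower()))
    terms = [t.lower() for t in query_terms]
    if not terms:
        return 0.0
    return sum(t in words for t in terms) / len(terms)


def rank_blocks(blocks, query_terms):
    """Return (score, block) pairs sorted from highest to lowest overlap."""
    scored = [(term_overlap(b, query_terms), b) for b in blocks]
    # sorted() is stable, so blocks with equal scores keep their page order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical candidate blocks from a query result page (assumed input).
blocks = [
    "Home | Deals | Sign in",
    "Canon EOS 2000D digital camera, 24 MP, $399",
    "Sponsored: cheap flights to Paris",
    "Nikon D3500 digital camera kit, $449",
]
ranked = rank_blocks(blocks, ["digital", "camera"])
for score, block in ranked:
    print(f"{score:.2f}  {block}")
```

In this toy input the two product blocks contain both query terms and rank above the navigation bar and the advertisement, which contain none. A real extractor would need to combine such a signal with layout cues, since irrelevant blocks can also echo query terms.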
