全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...
软件学报  2010 

Solution for Automatic Web Review Extraction
一种Web评论自动抽取方法

Keywords: Web user review,structured data record,Web data extraction
Web用户评论
,结构化数据记录,Web数据抽取

Full-Text   Cite this paper   Add to My Lib

Abstract:

Web user reviews are the important information source for many popular applications (e.g. monitoring and analysis of public opinion), and they need to be extracted accurately from Web pages. Web user reviews belong to user-generated contents, whose presentation is not restricted by the Web page template. Therefore new challenges are raised. First, the inconsistency of review contents on both DOM tree and visual appearance impair the similarity between review records; second, the review content in a review record corresponds to a complicated subtree rather than one single node in the DOM tree. To tackle these challenges, a comprehensive solution is proposed to perform automatic extraction of Web reviews by employing sophisticated techniques. The review records are extracted from Web pages based on the level-weighted tree similarity algorithm first, and then, the pure review contents in records are extracted by comparing the node consistency. The experimental results on news Web sites and forum Web sites indicate that our solution can achieve high extraction accuracy and efficiency.

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133