Computer Science (计算机科学), 2008

A Method of Web Information Extraction Based on Classification Algorithm
一种基于分类算法的网页信息提取方法

WANG Jian-Wei, YANG Dong-Qing, GAO Jun, WANG Teng-Jiao
汪建伟,杨冬青,高军,王腾蛟

Keywords: Web information extraction, attribute vector, wrapper, display attributes

Abstract:

Most existing algorithms for Web information extraction rely on HTML structure. Because the structure of HTML files changes frequently, the wrapper must be updated accordingly, and updating a wrapper requires substantial domain knowledge. This paper presents a new Web information extraction method based on a classification algorithm, which groups Web text by the display attributes of the HTML text. Extraction from Web pages is completed by classifying the Web text with different va...
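
As a rough illustration of grouping Web text by display attributes, the following Python sketch collects each text node under the set of display-related tags and attributes in effect around it. It is only a sketch under stated assumptions: the tag list, the attribute list, and the names `DisplayAttributeGrouper` and `group_by_display_attributes` are hypothetical and do not come from the paper, and the classification step is not shown because the abstract is cut off before describing it.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical choice of display-related tags and attributes; the abstract
# does not list the ones the paper actually uses.
STYLE_TAGS = {"b", "i", "u", "strong", "em", "h1", "h2", "h3", "font"}
STYLE_ATTRS = ("color", "size", "face")
# Elements without a closing tag are never pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class DisplayAttributeGrouper(HTMLParser):
    """Group text nodes by the display attributes in effect around them."""

    def __init__(self):
        super().__init__()
        self.stack = []                  # open elements: (tag, features)
        self.groups = defaultdict(list)  # attribute vector -> text snippets

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        features = []
        if tag in STYLE_TAGS:
            features.append(tag)
            attr_map = dict(attrs)
            for name in STYLE_ATTRS:
                if attr_map.get(name):
                    features.append(f"{name}={attr_map[name]}")
        self.stack.append((tag, tuple(features)))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # The "attribute vector" here is the sorted set of active features.
        vector = tuple(sorted({f for _, feats in self.stack for f in feats}))
        self.groups[vector].append(text)


def group_by_display_attributes(html):
    """Return a dict mapping each attribute vector to its text snippets."""
    parser = DisplayAttributeGrouper()
    parser.feed(html)
    parser.close()
    return dict(parser.groups)
```

Text rendered the same way (for example, all bold labels) ends up in one group regardless of where it sits in the HTML tree, which is why such a grouping could be less sensitive to structural changes than a structure-based wrapper.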
