Search Strategy and Implementation of a Topic Search Engine Spider

Keywords: web crawler, search engine, topic relevance, genetic algorithm, crawling


Abstract:

Based on the characteristics of web page structure, this paper proposes a crawling strategy that predicts topic relevance by passing a relevance value from page to page, and addresses the problems of channel jamming and missed pages (capture omission). First, a relevance value is passed along each link according to its anchor text. If the anchor text is relevant to the topic, the value is passed on unchanged; otherwise, it is multiplied by the genetic ratio before being passed on. During this process, the relevance value is reset to its initial value whenever a relevant web page is encountered. Experimental results show that the recall ratio is greatly improved.
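
The passing rule described in the abstract can be illustrated with a minimal Python sketch. It is only an interpretation of the abstract, not the paper's implementation: the names `propagate`, `value_on_arrival` and `follow_path` are hypothetical, the initial value of 1.0 and the genetic ratio of 0.5 are assumed for illustration, and the relevance judgments for anchor text and pages are taken as given booleans rather than computed.

```python
# Assumed constants for illustration; the abstract does not state their values.
INITIAL_VALUE = 1.0   # value a page carries when it is judged relevant
GENETIC_RATIO = 0.5   # attenuation applied when the anchor text is not relevant


def propagate(parent_value, anchor_relevant, genetic_ratio=GENETIC_RATIO):
    """Value passed along a link, decided by the link's anchor text."""
    if anchor_relevant:
        return parent_value            # relevant anchor: pass the value unchanged
    return parent_value * genetic_ratio  # otherwise: attenuate before passing on


def value_on_arrival(passed_value, page_relevant, initial_value=INITIAL_VALUE):
    """Value the target page carries once it has been fetched and judged."""
    if page_relevant:
        return initial_value           # a relevant page resets the value
    return passed_value


def follow_path(steps, initial_value=INITIAL_VALUE, genetic_ratio=GENETIC_RATIO):
    """Trace the value along one chain of links.

    `steps` is a list of (anchor_relevant, page_relevant) pairs, one per hop.
    Returns the value carried by each page reached along the chain.
    """
    value = initial_value
    history = []
    for anchor_relevant, page_relevant in steps:
        value = propagate(value, anchor_relevant, genetic_ratio)
        value = value_on_arrival(value, page_relevant, initial_value)
        history.append(value)
    return history


if __name__ == "__main__":
    # Three hops through irrelevant pages, then a relevant page is reached.
    path = [(True, False), (False, False), (False, False), (False, True)]
    print(follow_path(path))
```

Under these assumptions, a chain of irrelevant pages keeps a shrinking but non-zero value, so a crawler that ranks its frontier by this value can still reach relevant pages that lie behind irrelevant ones, which is consistent with the recall improvement the abstract reports.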

