Computer Science (计算机科学), 2007
Website Crawling for Specific Topics
Abstract:
In this paper, we propose a new approach for discovering websites on a specific topic on the WWW with high precision and low cost. The approach improves on traditional focused crawler techniques. Unlike a common Web crawler, which traverses the Web graph composed of HTML pages and hyperlinks, our crawler uses meta-search to obtain the URLs of relevant pages, then applies a heuristic search method to reduce the search cost and topic-relevance rules to increase precision. The experimental results show that the presented approach is both effective and efficient.
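The three stages named in the abstract (meta-search seeding, heuristic ordering, and a topic-relevance rule) can be illustrated with a minimal sketch. The abstract does not give the actual rules or heuristic, so everything here is an assumption made for illustration: the keyword set `TOPIC_KEYWORDS`, the keyword-overlap score in `relevance`, the use of URL tokens as the cheap ordering heuristic, the fetch `budget`, and the stubbed engine results and pages under `example` hosts.

```python
import heapq
import re
from urllib.parse import urlparse

# Hypothetical topic definition; the paper's real rules are not given in the abstract.
TOPIC_KEYWORDS = frozenset({"crawler", "search", "index"})


def relevance(text, keywords=TOPIC_KEYWORDS):
    """Assumed rule: fraction of topic keywords that occur in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return len(tokens & keywords) / len(keywords)


def merge_meta_search(result_lists):
    """Merge URL lists from several search engines, keeping first occurrences."""
    seen, merged = set(), []
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


def discover_sites(seed_urls, fetch, threshold=0.5, budget=3):
    """Best-first visit of seed URLs; returns hosts that have a relevant page."""
    # Cheap heuristic (an assumption): score the URL string before fetching.
    # heapq is a min-heap, so scores are negated to pop the best URL first.
    frontier = [(-relevance(url), url) for url in seed_urls]
    heapq.heapify(frontier)
    sites, fetched = [], 0
    while frontier and fetched < budget:
        _, url = heapq.heappop(frontier)
        host = urlparse(url).netloc
        if host in sites:
            continue  # site already accepted, so skip its other pages
        fetched += 1
        if relevance(fetch(url)) >= threshold:
            sites.append(host)
    return sites


# Stub data standing in for real search engines and real page downloads.
PAGES = {
    "http://a.example/crawler/index.html": "A focused crawler builds a search index",
    "http://a.example/search.html": "search crawler tips",
    "http://b.example/news.html": "Weather and sports news",
    "http://c.example/search/": "Search engines and index structures",
}
engine_1 = ["http://a.example/crawler/index.html", "http://b.example/news.html"]
engine_2 = [
    "http://c.example/search/",
    "http://a.example/crawler/index.html",
    "http://a.example/search.html",
]

seeds = merge_meta_search([engine_1, engine_2])
found = discover_sites(seeds, PAGES.get, budget=3)
print(found)
```

With this stub data the sketch reports `a.example` and `c.example` as topic sites. The second `a.example` page is never fetched because its site has already been accepted, which is one simple way a site-level crawler might save cost compared with a page-level one.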