计算机科学 (Computer Science), 2007
Research on Comparison Strategy of Mirror Pages Based on Hyper-Links
Abstract:
The Web contains many duplicated pages. These mirror pages distort analysis results, and they also occupy considerable storage and other resources, degrading system efficiency. Removing these duplicates is therefore an important problem. This paper analyzes a hyperlink-based method for removing duplicated pages and proves that only neighboring comparisons are required. A neighbor comparison method is proposed according to the distribution of Web pages by out-degree. Experimental results show that the number of comparisons is cut dramatically and computing efficiency is improved.
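
The abstract does not spell out the comparison procedure, so the following is only an illustrative sketch of what a neighbor-only comparison might look like, not the paper's algorithm. It assumes (hypothetically) that two pages are mirrors when they have identical sets of outgoing links, which implies equal out-degree. Under that assumption, sorting pages by out-degree and then by their link sets places mirrors next to each other, so each page only needs to be compared with its neighbor in the sorted order. The function name `group_mirrors` and the input layout are invented for this example.

```python
def group_mirrors(out_links):
    """Group pages with identical out-link sets using neighbor-only comparison.

    out_links: dict mapping a page URL to an iterable of its outgoing link targets.
    Returns (groups, comparisons), where groups is a list of lists of mirror pages.
    Assumption (not from the paper): identical out-link sets mean mirror pages.
    """
    # Sort key: out-degree first, then the canonical (sorted) link tuple, then URL.
    keyed = sorted(
        (len(set(links)), tuple(sorted(set(links))), page)
        for page, links in out_links.items()
    )

    groups = []
    comparisons = 0
    # Compare each page only with its immediate neighbor in the sorted order.
    for prev, cur in zip(keyed, keyed[1:]):
        comparisons += 1
        if prev[:2] == cur[:2]:  # same out-degree and same link set
            if groups and groups[-1][-1] == prev[2]:
                groups[-1].append(cur[2])  # extend the current mirror group
            else:
                groups.append([prev[2], cur[2]])  # start a new mirror group
    return groups, comparisons


if __name__ == "__main__":
    pages = {
        "a": ["x", "y"],
        "b": ["y", "x"],
        "c": ["x"],
        "d": ["x", "y", "z"],
        "e": ["x", "y"],
    }
    groups, comparisons = group_mirrors(pages)
    print(groups, comparisons)
```

With n pages, this sketch performs n - 1 comparisons after sorting, whereas comparing every pair would take n(n - 1)/2. A real system would likely compare page content or a similarity measure rather than exact link sets; that choice is outside what the abstract states.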