%0 Journal Article
%T Research on Comparison Strategy of Mirror Pages Based on Hyper-Links
基于超链接的镜像页面比较策略研究
%A YANG Nan
%A 杨楠
%J 计算机科学
%D 2007
%I
%X There are many duplicated pages on the Web. These mirror pages distort analysis results and occupy considerable storage space and resources, degrading system efficiency, so removing them is an important problem. This paper analyzes a hyperlink-based method for removing duplicated pages and proves that only neighboring comparisons are required. A neighbor-comparison method is proposed according to the out-degree distribution of Web pages. Experimental results show that the number of comparisons is cut dramatically and computing efficiency is improved.
%K Link analysis
%K Duplicated pages
%K Page resemblance
%K 链接分析
%K 镜像页面
%K 页面相似度
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=64A12D73428C8B8DBFB978D04DFEB3C1&aid=1E8759907DE65478A6ADBAFE2991F64A&yid=A732AF04DDA03BB3&vid=339D79302DF62549&iid=DF92D298D3FF1E6E&sid=8477411EEDB08A86&eid=EFD65B51496FB200&journal_id=1002-137X&journal_name=计算机科学&referenced_num=0&reference_num=7