%0 Journal Article
%T WAN-Based Distributed Web Crawling
%A XU Xiao (许笑)
%A ZHANG Wei-Zhe (张伟哲)
%A ZHANG Hong-Li (张宏莉)
%A FANG Bin-Xing (方滨兴)
%J Journal of Software (软件学报)
%D 2010
%X WAN-based distributed Web crawling systems face three core issues: Web partition, agent collaboration, and agent deployment. Centering on these issues, this paper presents a comprehensive overview of the strategies currently adopted by the academic and business communities. The experiences, problems, and challenges encountered by WAN-based distributed Web crawlers are classified and discussed in depth. A summary of current evaluation indicators is also given. Finally, conclusions and suggestions for future research are put forward.
%K search engine
%K WAN-based distributed crawling
%K Web partition
%K agent collaboration
%K agent deployment
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=514521740A91E64D385D3BD9A0356FE1&yid=140ECF96957D60B2&vid=659D3B06EBF534A7&iid=E158A972A605785F&sid=0AFD076159674A31&eid=FA63B973BAB5E93D&journal_id=1000-9825&journal_name=软件学报&referenced_num=1&reference_num=34