%0 Journal Article
%T WAN-Based Distributed Web Crawling
%A XU Xiao (许笑)
%A ZHANG Wei-Zhe (张伟哲)
%A ZHANG Hong-Li (张宏莉)
%A FANG Bin-Xing (方滨兴)
%J Journal of Software (软件学报)
%D 2010
%I
%X WAN-based distributed Web crawling systems face three core issues: Web partition, agent collaboration, and agent deployment. Centering on these issues, this paper presents a comprehensive overview of the strategies currently adopted by the academic and business communities. The experiences, problems, and challenges encountered by WAN-based distributed Web crawlers are classified and discussed in depth. A summary of current evaluation indicators is also given. Finally, conclusions and suggestions for future research are put forward.
%K search engine
%K WAN-based distributed crawling
%K Web partition
%K agent collaboration
%K agent deployment
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=514521740A91E64D385D3BD9A0356FE1&yid=140ECF96957D60B2&vid=659D3B06EBF534A7&iid=E158A972A605785F&sid=0AFD076159674A31&eid=FA63B973BAB5E93D&journal_id=1000-9825&journal_name=软件学报&referenced_num=1&reference_num=34