%0 Journal Article
%T Ajax crawling algorithm based on state transition graph
%A GUO Hao (郭浩)
%A LU Yu-liang (陆余良)
%A LIU Jin-hong (刘金红)
%J Application Research of Computers (计算机应用研究)
%D 2009
%X Traditional Web crawlers cannot meet the challenges of crawling Ajax applications, such as JavaScript execution, state identification and navigation, and elimination of duplicate states. After examining these challenges, this paper introduces a state transition graph and, based on it, proposes an algorithm to retrieve Ajax states and the Deep Web content behind them. To improve accuracy and reduce unnecessary states, the algorithm is further refined with Ajax fingerprinting and DOM filtering. The experimental results indicate that the algorithm is effective and efficient.
%K Ajax crawler
%K state transition graph
%K Web crawler
%K Deep Web
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=A9D9BE08CDC44144BE8B5685705D3AED&aid=06A42C33940BAEFB141920E72258B91E&yid=DE12191FBD62783C&vid=96C778EE049EE47D&iid=708DD6B15D2464E8&sid=FED0C3D3EAC828A8&eid=194903A9E04A8E69&journal_id=1001-3695&journal_name=计算机应用研究&referenced_num=0&reference_num=6