Application Research of Computers (计算机应用研究), 2011
Deep Web result pattern extracting based on heuristic information
Abstract:
Extracting schema information is a necessary step in Deep Web data research. To address the loss of schema information in Deep Web result pages, this paper proposes a novel approach to Deep Web result pattern extraction based on heuristic information. By analyzing the data in Deep Web result pages and using heuristic information to assign the correct attribute names to that data, the approach obtains the corresponding Deep Web result pattern. It also resolves structural conflicts through standardization. Experimental results show that the method extracts result patterns effectively.