现代图书情报技术 (New Technology of Library and Information Service), 2005
The Algorithm of Forecasting URL-Topic Based on Web Structure and Web Page Contents
Abstract:
This paper introduces the core algorithm of a Web topic information gathering system that we designed: the Forecast URL-Topic algorithm. Drawing on the related theory and an analysis of experimental data, we find that the topic of a hyperlink is determined primarily by three factors: the topic similarity of the parent Web page, the topic similarity of the anchor text (or extended anchor text), and the structural characteristics of the Web graph. On this basis, we propose an algorithm for forecasting URL topics based on Web structure and Web page contents. The system evaluation shows that the algorithm is highly efficient.
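
To make the three factors concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes each factor has already been reduced to a score in [0, 1] and combines them with a normalized weighted sum. The names `LinkEvidence`, `predict_url_topic_score`, and `rank_links`, as well as the default weights, are hypothetical and chosen only for illustration; the abstract does not state how the factors are actually combined.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LinkEvidence:
    """Hypothetical container for the three factors named in the abstract."""
    parent_similarity: float  # topic similarity of the parent Web page, in [0, 1]
    anchor_similarity: float  # topic similarity of the (extended) anchor text, in [0, 1]
    structure_score: float    # score derived from the Web graph structure, in [0, 1]


def predict_url_topic_score(
    ev: LinkEvidence,
    weights: Tuple[float, float, float] = (0.4, 0.4, 0.2),  # assumed, not from the paper
) -> float:
    """Combine the three factors into one relevance estimate (assumed weighted sum)."""
    w_parent, w_anchor, w_struct = weights
    total = w_parent + w_anchor + w_struct
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = (
        w_parent * ev.parent_similarity
        + w_anchor * ev.anchor_similarity
        + w_struct * ev.structure_score
    )
    # Normalize so the result stays in [0, 1] for any positive weights.
    return combined / total


def rank_links(candidates: Dict[str, LinkEvidence]) -> List[str]:
    """Order unvisited URLs so the most topic-relevant ones are crawled first."""
    return sorted(
        candidates,
        key=lambda url: predict_url_topic_score(candidates[url]),
        reverse=True,
    )
```

In a topic-focused crawler, a function like `rank_links` would typically decide the order in which unvisited URLs are fetched, so that links predicted to be on topic are followed before the rest.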