Journal of Computer Applications (计算机应用), 2006
Retrieval Algorithm Based on Hyperlinks and Content Similarity
Abstract:
On the web, classical hyperlink analysis algorithms (such as the HITS algorithm) focus mainly on the authority of a web page rather than its topic, so they tend to drift away from the target topic while traversing hyperlinks. This paper analyzes the cause of topic drift in the HITS algorithm. By combining topic analysis with content relevance evaluation, it presents a novel web information retrieval algorithm, WHITS. Experimental results show that WHITS mines the latent semantic relationships between hyperlinks and performs well in topic-specific crawling.
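The abstract does not give the WHITS update rules, so the following is only a minimal sketch of the general idea it describes: running HITS-style hub and authority updates while weighting each link by content relevance to the topic. The edge weight (the product of the two pages' relevance scores), the bag-of-words cosine similarity, the function names `cosine_similarity` and `weighted_hits`, and the sample pages are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity between two strings (an assumed relevance measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def _normalize(scores):
    """Scale scores to unit L2 norm; leave them at zero if every score is zero."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {p: (v / norm if norm else 0.0) for p, v in scores.items()}


def weighted_hits(links, relevance, iterations=50):
    """HITS iteration in which the link p -> q carries weight relevance[p] * relevance[q].

    links:     dict mapping a page to the list of pages it links to
    relevance: dict mapping a page to its topic relevance in [0, 1]
    With every relevance set to 1.0 this reduces to plain HITS.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum of weighted hub scores of pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += hub[p] * relevance.get(p, 0.0) * relevance.get(q, 0.0)
        auth = _normalize(new_auth)
        # Hub update: sum of weighted authority scores of pages linked to.
        new_hub = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_hub[p] += auth[q] * relevance.get(p, 0.0) * relevance.get(q, 0.0)
        hub = _normalize(new_hub)
    return auth, hub


# Hypothetical sample: "portal" is popular but off-topic, "on_topic" matches the query.
topic = "hyperlink analysis web retrieval"
texts = {
    "hub1": "hyperlink analysis web retrieval survey",
    "hub2": "web retrieval and hyperlink analysis notes",
    "hub3": "celebrity news and sports scores",
    "on_topic": "hyperlink analysis algorithms for web retrieval",
    "portal": "news sports weather shopping",
}
links = {
    "hub1": ["on_topic", "portal"],
    "hub2": ["on_topic", "portal"],
    "hub3": ["portal"],
}
relevance = {page: cosine_similarity(topic, text) for page, text in texts.items()}

plain_auth, _ = weighted_hits(links, {page: 1.0 for page in texts})
topic_auth, _ = weighted_hits(links, relevance)
```

In this toy graph, plain HITS ranks the heavily linked `portal` page above `on_topic`, which is the topic drift the abstract refers to. With relevance weighting, links into the off-topic page contribute nothing, so `on_topic` receives the higher authority score.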