Innovating Web Page Classification Through Reducing Noise

Keywords: Web page classification, similarity measure, classification algorithm without noise, noise reduction, computer networks

Abstract:

This paper presents a new method for eliminating noise in Web page classification. It first describes the representation of a Web page based on HTML tags. Then, through a novel distance formula, it eliminates the noise in the similarity measure. After carefully analyzing Web pages, we design an algorithm that can distinguish related hyperlinks from noisy ones. We can utilize the non-noisy hyperlinks to improve the performance of Web page classification (the CAWN algorithm). Any page can then be classified through the text and categories of the neighbor pages related to it. The experimental results show that our approach improves classification accuracy.

This work is supported by the National Natural Science Foundation of China (No. 60075019 and No. 9010402) and the National Science Foundation of Beijing (No. 4011003).

LI Xiaoli received his Ph.D. degree from the Institute of Computing Technology, The Chinese Academy of Sciences, in 2001. He taught artificial intelligence in the Graduate School of the University of Science and Technology of China in 1999. His research interests include Web mining, information retrieval, and natural language processing. He has published more than 20 papers in international conferences and journals. Since 2000, he has been working as a member of the research staff at the National University of Singapore.

SHI Zhongzhi received his B.E. and M.E. degrees from the University of Science and Technology of China in 1964 and 1968, respectively. He is currently the Executive Director of the Department of Intelligent Computer Science, Institute of Computing Technology. His research interests include artificial intelligence, neural computing, cognitive science, advanced database technology, and new-generation computers. He has published 10 books and more than 300 technical papers. He is a member of the Standing Steering Committee of PRICAI, Vice President of the Chinese Artificial Intelligence Society, and Secretary-General of the China Computer Federation. He is also the Vice President of the Chinese Society of Machine Learning and Vice President of the Chinese Society of Knowledge Engineering.
