International Journal of Machine Intelligence 2011

CONTENT AND STRUCTURE BASED CLASSIFICATION OF XML DOCUMENTS

SHASHIREKHA H.L, VANISHREE K.S, SUMANGALA N.

Keywords: XML documents, text classification, ‘k’ nearest neighbors, cosine similarity, tree structure

Abstract:

The ever-increasing number of XML documents available on the World Wide Web demands automated tools and techniques that make the search and retrieval of XML documents more effective and efficient. Classification of XML documents is one of the significant tasks being explored by many researchers in this direction. Because XML documents have an inherent structure, conventional text classification methods cannot be applied to them directly. Hence, there is a need for tools and techniques that automatically classify XML documents. In this work, we have developed an algorithm based on ‘k’ nearest neighbors that classifies XML documents by considering both their content and their structure. The algorithm is tested on a subset of the MEDLINE dataset for different values of ‘k’ and varying training set sizes, and the results are tabulated.
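The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that content is represented as a bag of words over element text, that structure is represented as counts of root-to-node tag paths, and that the two cosine similarities are blended with a hypothetical weight `alpha`. All function names and parameters here are illustrative assumptions.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def features(xml_text):
    """Return (content, structure) count vectors for one XML document.

    Assumption: content = lowercased whitespace tokens of all element text,
    structure = root-to-node tag paths. The paper may use other features.
    """
    root = ET.fromstring(xml_text)
    content, structure = Counter(), Counter()

    def walk(node, path):
        path = path + "/" + node.tag
        structure[path] += 1
        # node.text comes before the children; each child.tail follows a child
        for chunk in [node.text] + [child.tail for child in node]:
            if chunk:
                content.update(chunk.lower().split())
        for child in node:
            walk(child, path)

    walk(root, "")
    return content, structure


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity(doc_a, doc_b, alpha=0.5):
    """Blend content and structure similarity; alpha is a hypothetical weight."""
    return alpha * cosine(doc_a[0], doc_b[0]) + (1 - alpha) * cosine(doc_a[1], doc_b[1])


def knn_classify(xml_text, training, k=3, alpha=0.5):
    """Label a document by majority vote among its k most similar training documents.

    `training` is a list of (xml_text, label) pairs.
    """
    query = features(xml_text)
    scored = sorted(
        ((similarity(query, features(doc), alpha), label) for doc, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

Calling `knn_classify(document, training, k=3)` with a list of labeled XML strings returns the majority label among the three nearest neighbors. Setting `alpha=1.0` reduces the sketch to content-only classification and `alpha=0.0` to structure-only, which is one simple way to see how much each signal contributes.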
