全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

Content-based Text Categorization using Wikitology

Keywords: Text Categorization , Machine Learning , Wikitology , Support Vector Machine , 20- Newsgroup. Reuters-21578 , IJCSI

Full-Text   Cite this paper   Add to My Lib

Abstract:

The process of text categorization assigns labels or categories to each text document according to the semantic content of the document. The traditional approaches to text categorization used features from the text like: words, phrases, and concepts hierarchies to represent and reduce the dimensionality of the documents. Recently, researchers addressed this brittleness by incorporating background knowledge into document representation by using some external knowledge base for example WordNet, Open Project Directory (OPD) and Wikipedia. In this paper we have tried to enhance text categorization by integrating knowledge from Wikitology. Wikitology is a knowledge repository which extracts knowledge from Wikipedia in structured/unstructured forms with a warping of ontological structure. We have augmented text document by exploring Wikitology fields like: {Bag of Words, titles, redirects, entity types, categories and linked entities}. We also propose and evaluate different text representations and text enrichment technique. The classification is performed by using Support Vector Machine (SVM and we have validated this experiment on 4-fold cross-validation.

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133