|
Journal of Computer Science and Technology (计算机科学技术学报), 2009
Self-Switching Classification Framework for Titled Documents
Keywords: text analysis, machine learning, Web text analysis
Abstract: Ambiguous words are words that have multiple meanings, such as "apple" and "window". In text classification they are usually removed by feature reduction methods such as Information Gain. Sometimes, however, the corpus contains too many ambiguous words for discarding all of them to be viable, as when classifying documents from the Web. In this paper we look for a method to classify Titled documents with the help of ambiguous words. Titled documents are a kind of document that has a simple stru...
|