Journal of Software (软件学报), 2008

Confusion Class Discrimination Techniques for Text Classification
面向文本分类的混淆类判别技术

Keywords: text classification, confusion class discrimination, feature selection, classification error distribution, machine learning, natural language processing

Abstract:

This paper analyzes the confusion class phenomenon in text classification and studies confusion class discrimination techniques to improve classification performance. First, a recognition technique based on the classification error distribution is proposed to identify the confusion class sets in a pre-defined taxonomy. Second, to discriminate confusion classes effectively, a feature selection approach based on discrimination capability is proposed, in which each candidate feature's ability to discriminate a class pair is evaluated. Finally, a two-stage classifier integrates the baseline classifier with the confusion class classifiers, and the outputs of the two stages are combined into the final result. The second-stage confusion class classifiers are activated only when the class assigned to the input text by the first-stage baseline classifier belongs to a confusion class set; they then classify the text again. In comparison experiments, the Newsgroup and 863 Chinese evaluation data collections are used to evaluate the proposed techniques. The results show that the methods significantly improve the performance of single-label multi-class (SMC) classifiers.
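
The recognition step and the two-stage dispatch described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the function names `find_confusion_pairs` and `two_stage_predict`, the mutual-error-rate score, the `threshold` parameter, and the simplification that the second stage overrides the first rather than combining both outputs.

```python
from collections import Counter


def find_confusion_pairs(y_true, y_pred, threshold=0.1):
    """Return class pairs that a baseline classifier tends to mix up.

    Assumed scoring rule (not the paper's formula): for a pair (a, b),
    rate = (errors a->b + errors b->a) / (documents of a + documents of b).
    A pair is reported when rate >= threshold.
    """
    support = Counter(y_true)  # number of documents per true class
    errors = Counter()         # (true, predicted) -> count of misclassifications
    for t, p in zip(y_true, y_pred):
        if t != p:
            errors[(t, p)] += 1

    pairs = set()
    for t, p in list(errors):
        a, b = sorted((t, p))  # canonical order so (a, b) == (b, a)
        mutual = errors[(a, b)] + errors[(b, a)]
        total = support[a] + support[b]
        if total and mutual / total >= threshold:
            pairs.add((a, b))
    return pairs


def two_stage_predict(text, baseline, pair_classifiers):
    """Run the baseline, then re-classify only if it lands in a confusion pair.

    baseline: callable mapping text -> class label (stage one).
    pair_classifiers: dict mapping a class pair (tuple) to a callable that
    maps text -> class label within that pair (stage two).
    Assumption: the stage-two label simply replaces the stage-one label.
    """
    first = baseline(text)
    for pair, classifier in pair_classifiers.items():
        if first in pair:
            return classifier(text)
    return first


# Toy data with hypothetical class names.
y_true = ["hw_pc", "hw_pc", "hw_mac", "hw_mac", "sport", "sport"]
y_pred = ["hw_mac", "hw_pc", "hw_pc", "hw_mac", "sport", "sport"]
pairs = find_confusion_pairs(y_true, y_pred, threshold=0.25)
```

On this toy data, `hw_pc` and `hw_mac` are each mistaken for the other once out of four documents in total, so they form the only reported pair; `sport` is never confused and would bypass the second stage in `two_stage_predict`.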
