计算机应用研究 2006
Feature Selection for Neural Network-based Chinese Text Categorization
Abstract:
The main problem in neural network (NN) based Chinese text categorization is feature selection for textual data. Feature selection involves deciding which features to select and how large the dimensionality of the feature space should be. To address this problem, this paper proposes a feature selection method that uses Information Gain (IG) and Principal Component Analysis (PCA). The experiments compare and analyze the categorization performance of different feature selection methods and different feature dimensionalities. The results demonstrate the superiority of the proposed feature selection method for NN-based Chinese text categorization. The experiments also show that the performance of the NN is highest when the feature dimensionality is around 200.
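
As an illustration of how IG and PCA can be combined, the following is a minimal sketch and not the paper's implementation. The abstract does not state how the two steps are chained, so the sketch assumes that terms are first ranked by IG over binary term presence, the top-ranked terms are kept, and PCA then projects the reduced term matrix to the final dimensionality. The function names (`information_gain`, `pca_project`, `select_features`), the parameters, and the toy data are all hypothetical.

```python
import numpy as np


def information_gain(X, y):
    """IG of each term with respect to the class label (binary term presence)."""
    X = (np.asarray(X) > 0).astype(float)  # 1.0 if the term occurs in the document
    y = np.asarray(y)
    classes = np.unique(y)

    def entropy(p):
        p = p[p > 0]  # skip zero probabilities, since 0 * log(0) is taken as 0
        return -np.sum(p * np.log2(p))

    # H(C): entropy of the class prior
    h_class = entropy(np.array([(y == c).mean() for c in classes]))
    p_term = X.mean(axis=0)  # P(t): fraction of documents containing each term

    ig = np.full(X.shape[1], h_class)
    for j in range(X.shape[1]):
        # Subtract P(t) * H(C | t) and P(not t) * H(C | not t)
        for present, weight in ((1.0, p_term[j]), (0.0, 1.0 - p_term[j])):
            mask = X[:, j] == present
            if mask.any():
                cond = np.array([(y[mask] == c).mean() for c in classes])
                ig[j] -= weight * entropy(cond)
    return ig


def pca_project(X, n_components):
    """Project documents onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def select_features(X, y, n_ig_terms, n_components):
    """Assumed pipeline: keep the top-IG terms, then reduce them with PCA."""
    ig = information_gain(X, y)
    top = np.argsort(ig)[::-1][:n_ig_terms]
    reduced = pca_project(np.asarray(X, dtype=float)[:, top], n_components)
    return top, reduced


# Toy document-term count matrix (4 documents, 4 terms) with 2 classes.
X = np.array([
    [2, 0, 1, 1],
    [3, 0, 0, 1],
    [0, 2, 1, 1],
    [0, 1, 0, 1],
])
y = np.array([0, 0, 1, 1])

top, Z = select_features(X, y, n_ig_terms=2, n_components=1)
print("terms kept by IG:", sorted(top.tolist()))
print("NN input shape:", Z.shape)
```

In a setting like the one the abstract describes, `n_components` would be set near 200 and the rows of `Z` would serve as input vectors to the neural network; the number of terms kept by IG is a free parameter of this sketch.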