Computer Science (计算机科学), 2004
A Dynamic Vector Space Model for Internet News Textual Categorization
Abstract:
The traditional Vector Space Model (VSM) ignores the relationships between features and is not suited to dynamic training. Targeting Internet news, whose topics and focus change dynamically, this paper proposes a Dynamic VSM (DVSM). Multiple discriminating features that contribute similarly to classification are combined into one pattern, and patterns serve as the basic feature dimensions. When new samples must be learned, a dynamic incremental training method moves the discriminating features whose contribution has changed between patterns, which suits the real-time nature of the Internet. Comparison experiments on static and dynamic training sets show that DVSM significantly outperforms the traditional model in real-time categorization of Internet news.
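
The abstract does not specify how patterns are formed or how features move between them, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a feature's "contribution" can be approximated by its dominant class and by how concentrated it is in that class (its purity), and that features sharing a dominant class and a purity bucket form one pattern. The class name `PatternSpace`, the parameter `n_buckets`, and the scoring rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


class PatternSpace:
    """Toy pattern-based vector space (illustrative assumption, not the paper's method)."""

    def __init__(self, n_buckets=4):
        self.n_buckets = n_buckets
        # feature -> Counter mapping class label -> number of documents seen
        self.class_counts = defaultdict(Counter)
        # feature -> pattern id, where a pattern id is (dominant class, purity bucket)
        self.pattern_of = {}

    def _pattern_id(self, feature):
        # Assumed contribution measure: the share of the feature's documents
        # that fall in its most frequent class.
        counts = self.class_counts[feature]
        total = sum(counts.values())
        top_class, top = counts.most_common(1)[0]
        purity = top / total
        bucket = min(int(purity * self.n_buckets), self.n_buckets - 1)
        return (top_class, bucket)

    def learn(self, tokens, label):
        """Incrementally learn one labeled sample.

        Returns a list of (feature, old pattern, new pattern) for features
        that were already known and moved to a different pattern.
        """
        moved = []
        for feature in set(tokens):
            self.class_counts[feature][label] += 1
            new = self._pattern_id(feature)
            old = self.pattern_of.get(feature)
            if old != new:
                self.pattern_of[feature] = new
                if old is not None:
                    moved.append((feature, old, new))
        return moved

    def vectorize(self, tokens):
        """Map a document to pattern-level counts; unknown features are skipped."""
        vec = Counter()
        for feature in tokens:
            if feature in self.pattern_of:
                vec[self.pattern_of[feature]] += 1
        return vec
```

In this sketch, each call to `learn` updates only the features of the new sample, so no full retraining is needed, and a document vector has one dimension per pattern rather than one per feature.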