Cluster analysis in computational linguistics has seldom addressed the
pragmatic level. Pragmatic features of a corpus relate to specific situations,
including backgrounds, titles and habits. Improving the accuracy of clustering
for conversations collected from international students at Tsinghua University
therefore requires contextual features. We collected four hundred conversations
as a corpus and represented it in a Vector Space Model. Using the Oxford-Duden
Dictionary and other methods, we modified the model and obtained three groups.
We tested our hypothesis with a self-organizing map neural network. The results
suggest that the modified model produced better clustering results than the
unmodified model.
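
The pipeline summarized above (a vector space representation followed by a self-organizing map) can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the term-frequency weighting, the one-dimensional map with three nodes, and the names build_vsm, train_som, best_unit and assign_clusters are all assumptions made for illustration, and the dictionary-based pragmatic modification is not modeled here.

```python
import math
import random
from collections import Counter


def build_vsm(conversations):
    """Turn each conversation into a length-normalized term-frequency vector.

    Assumption: plain whitespace tokenization; the paper's actual features
    (including any pragmatic ones) are not reproduced.
    """
    tokenized = [text.lower().split() for text in conversations]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [float(counts[term]) for term in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def best_unit(weights, vec):
    """Index of the map node whose weights are closest to vec (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], vec)),
    )


def train_som(vectors, n_units=3, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return its node weight vectors.

    Assumption: linearly decaying learning rate and Gaussian neighborhood.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    sigma0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = sigma0 * (1.0 - frac) + 1e-3
        # Present the vectors in a fresh random order each epoch.
        for vec in rng.sample(vectors, len(vectors)):
            bmu = best_unit(weights, vec)
            for j, w in enumerate(weights):
                # Nodes near the best-matching unit move more than distant ones.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (vec[k] - w[k])
    return weights


def assign_clusters(weights, vectors):
    """Label each vector with the index of its best-matching map node."""
    return [best_unit(weights, v) for v in vectors]
```

Under these assumptions, one would call build_vsm on the list of conversation texts, pass the resulting vectors to train_som, and read the group of each conversation from assign_clusters. A modified model in the sense of the abstract would change what build_vsm produces, leaving the map training unchanged.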