Application Research of Computers (计算机应用研究), 2005
Studies on the Consistency of Word-segmented Chinese Corpus
Abstract:
This paper presents a new method for checking and proofreading the consistency of word segmentation, based on an analysis of the inconsistencies found in a large-scale word-segmented Chinese corpus. The method extracts syntactic and semantic collocations among words, uses an SVM to judge the test sequences, and thereby further ensures the correctness of segmentation in large-scale corpora.
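To make the notion of segmentation inconsistency concrete, the following minimal Python sketch collects character strings that occur as a single word in one place in a corpus and as a run of several words in another. It is an illustration under stated assumptions, not the paper's procedure: the function name `find_inconsistent_segmentations`, the `max_words` parameter, and the toy corpus are all hypothetical, and the sketch covers only the detection of candidate strings. Deciding which segmentation is correct in context is the step the abstract assigns to the SVM over collocation features, which is not reproduced here.

```python
from collections import defaultdict


def find_inconsistent_segmentations(sentences, max_words=3):
    """Return strings that occur both as one word and as a multi-word run.

    sentences: list of sentences, each a list of already segmented words.
    max_words: longest run of adjacent words joined into one candidate string
               (a hypothetical parameter chosen for this sketch).
    """
    variants = defaultdict(set)
    for words in sentences:
        for start in range(len(words)):
            for length in range(1, max_words + 1):
                run = tuple(words[start:start + length])
                if len(run) < length:
                    break  # ran past the end of the sentence
                # Index every run by the unsegmented string it spells out.
                variants["".join(run)].add(run)
    # Keep strings seen with more than one segmentation, at least one of
    # which treats the whole string as a single word.
    return {
        text: runs
        for text, runs in variants.items()
        if len(runs) > 1 and any(len(run) == 1 for run in runs)
    }


# Toy corpus (invented for illustration): "中国菜" is one word in the first
# sentence and two words in the second.
corpus = [
    ["我", "喜欢", "中国菜"],
    ["他", "喜欢", "中国", "菜"],
]
candidates = find_inconsistent_segmentations(corpus)
```

Each candidate returned this way is only a suspect: some strings are legitimately segmented differently depending on context, which is why a classifier over the surrounding words is needed to judge individual occurrences.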