Journal of Computer Applications (计算机应用), 2008
Resolving combinational ambiguity in Chinese word segmentation based on rule mining and Naive Bayes method
Abstract:
Combinational ambiguity is one of the most difficult problems in Chinese word segmentation. After an in-depth analysis of existing algorithms in the literature, this paper proposes a new segmentation algorithm. The algorithm automatically mines word collocation rules and grammar rules from a training corpus, then combines the mined rules with the Naive Bayes method to make an integrated decision on each combinational ambiguity. Extensive experiments show that the proposed algorithm improves accuracy by 8% over related work.
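To make the idea of combining rules with Naive Bayes concrete, here is a minimal sketch in Python. It is not the paper's implementation: the abstract does not say how the rule decisions and the Naive Bayes decision are integrated, so the "rule first, Naive Bayes as fallback" order is an assumption. The names `NaiveBayesDisambiguator` and `decide`, the rule list, and the toy training data are all hypothetical. The example string is 才能, which can be one word ("talent") or two words, 才 + 能 ("only then can").

```python
import math
from collections import Counter, defaultdict


class NaiveBayesDisambiguator:
    """Labels an ambiguous string as one word ("merge") or two ("split")
    from the words around it, using multinomial Naive Bayes."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # add-alpha smoothing constant
        self.label_counts = Counter()               # label -> number of examples
        self.feature_counts = defaultdict(Counter)  # label -> context word counts
        self.vocab = set()                          # all context words seen

    def train(self, examples):
        # examples: iterable of (context_words, label) pairs
        for context, label in examples:
            self.label_counts[label] += 1
            for word in context:
                self.feature_counts[label][word] += 1
                self.vocab.add(word)

    def log_score(self, context, label):
        # log P(label) + sum of log P(word | label), with smoothing
        total = sum(self.label_counts.values())
        score = math.log(self.label_counts[label] / total)
        denom = (sum(self.feature_counts[label].values())
                 + self.alpha * len(self.vocab))
        for word in context:
            count = self.feature_counts[label][word]
            score += math.log((count + self.alpha) / denom)
        return score

    def classify(self, context):
        return max(self.label_counts,
                   key=lambda label: self.log_score(context, label))


def decide(context, rules, model):
    """Assumed integration: the first matching rule wins; if no rule
    matches, fall back to the Naive Bayes decision."""
    for trigger, label in rules:
        if trigger in context:
            return label
    return model.classify(context)


# Toy, hand-made training contexts for 才能 (not data from the paper).
training = [
    (["有", "的", "人"], "merge"),       # 有才能的人
    (["发挥", "他", "的"], "merge"),     # 发挥他的才能
    (["只有", "努力", "成功"], "split"),  # 只有努力才能成功
    (["只有", "这样", "解决"], "split"),  # 只有这样才能解决
]
model = NaiveBayesDisambiguator()
model.train(training)

# Hypothetical collocation rule: 只有 ... 才 能 signals the two-word reading.
rules = [("只有", "split")]

nb_only = model.classify(["只有", "的", "人"])
combined = decide(["只有", "的", "人"], rules, model)
```

In this toy setup the context contains evidence for both readings, so the rule and the classifier can disagree. Where a rule fires it overrides the statistical decision, and the classifier handles every context that no rule covers.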