现代图书情报技术 (New Technology of Library and Information Service), 2008
The Research of Character-Position-Based Chinese Word Segmentation
Abstract:
This paper surveys the current state of Chinese word segmentation and introduces several representative approaches, then presents a character-position-based segmentation method that takes the Chinese character as the smallest unit. The method derives the probability distribution of a word from the probability distributions of its characters, so it performs much better than other approaches at unknown word recognition. It is implemented with a machine-learning technique, maximum entropy, and two experiments are conducted to compare and analyze the results.
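To make the character-position idea concrete, the sketch below shows one common way to recast segmentation as tagging each character with its position inside a word. The four-tag scheme (B, M, E, S for begin, middle, end, single) and the function names words_to_tags and tags_to_words are assumptions for illustration; the abstract does not state which tag set is used. A classifier such as a maximum entropy model would predict one tag per character, and that step is omitted here.

```python
def words_to_tags(words):
    """Map a segmented sentence to one position tag per character.

    Assumed tag set: B = word-initial, M = word-internal,
    E = word-final, S = single-character word.
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their position tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # these tags close the current word
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


if __name__ == "__main__":
    segmented = ["中文", "分词", "是", "基础"]
    tags = words_to_tags(segmented)
    print(tags)
    print(tags_to_words("".join(segmented), tags))
```

Because the tags are assigned to characters, not to dictionary entries, a word never seen in training can still be recovered whenever its characters receive a consistent tag sequence, which is the intuition behind the unknown-word claim in the abstract.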