Application Research of Computers (计算机应用研究), 2010
Chinese organization names recognition with Tri-training learning
Abstract:
To address the data scarcity problem in Chinese organization name recognition, this paper presents a co-training style method for organization name recognition and proposes a novel sample selection method for Tri-training learning with three classifiers: conditional random fields (CRFs), support vector machines (SVMs) and memory-based learning (MBL). During Tri-training, newly labeled samples are selected by a selection model that maximizes training utility, and agreement is computed with an agreement scoring function. Experiments on a large-scale corpus show that the proposed Tri-training approach exploits unlabeled data more effectively and stably to improve generalization ability than co-training and standard Tri-training.
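The abstract does not give the selection model or the agreement scoring function, so the following is only a minimal sketch of the labeling step in standard Tri-training: for each classifier, an unlabeled sample becomes a training candidate when the other two classifiers assign it the same label. The suffix-rule classifiers, the sample tokens, the plain agreement rate, and all function names here are illustrative assumptions standing in for the CRF, SVM and MBL models and the scoring function of the paper.

```python
from typing import Callable, List, Sequence, Tuple

Label = str
Classifier = Callable[[str], Label]  # maps one sample to one label


def agreed_samples(
    unlabeled: Sequence[str], clf_j: Classifier, clf_k: Classifier
) -> List[Tuple[str, Label]]:
    """Return (sample, label) pairs on which both classifiers agree."""
    picked = []
    for x in unlabeled:
        label_j, label_k = clf_j(x), clf_k(x)
        if label_j == label_k:
            picked.append((x, label_j))
    return picked


def agreement_rate(
    unlabeled: Sequence[str], clf_j: Classifier, clf_k: Classifier
) -> float:
    """Fraction of samples with matching labels (assumed stand-in score)."""
    if not unlabeled:
        return 0.0
    return len(agreed_samples(unlabeled, clf_j, clf_k)) / len(unlabeled)


def tri_training_round(
    unlabeled: Sequence[str], classifiers: Sequence[Classifier]
) -> List[List[Tuple[str, Label]]]:
    """For each of the three classifiers, collect what its two peers agree on."""
    proposals = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        proposals.append(agreed_samples(unlabeled, classifiers[j], classifiers[k]))
    return proposals


def suffix_rule(suffixes: Tuple[str, ...]) -> Classifier:
    """Hypothetical toy classifier: tag a token ORG if it ends with a suffix."""
    return lambda token: "ORG" if token.endswith(suffixes) else "O"


# Three deliberately different toy views of the same task.
clf_a = suffix_rule(("公司", "大学"))
clf_b = suffix_rule(("公司", "银行"))
clf_c = suffix_rule(("大学", "银行"))

UNLABELED = ["北京大学", "中国银行", "华为公司", "今天"]

if __name__ == "__main__":
    for i, proposal in enumerate(tri_training_round(UNLABELED, [clf_a, clf_b, clf_c])):
        print(f"candidates for classifier {i}: {proposal}")
```

In a full loop, each classifier would be retrained on its labeled data plus its candidates and the round repeated until no classifier changes. The method in the paper additionally filters candidates by training utility, which this sketch does not attempt to reproduce.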