%0 Journal Article
%T Entity Recognition Research in Online Medical Texts
%A 苏娅
%A 刘杰
%A 黄亚楼
%J 北京大学学报(自然科学版)
%D 2016
%R 10.13209/j.0479-8023.2016.020
%X For online medical texts, the authors design recognition features that account for the characteristics of the medical domain and run entity recognition experiments on a self-built dataset. The study covers five common diseases: gastritis, lung cancer, asthma, hypertension, and diabetes. A conditional random field (CRF), a relatively advanced machine learning model of recent years, is used for training and testing. Five types of target entity are extracted: diseases, symptoms, drugs, treatment methods, and examinations. The effectiveness of the proposed features is verified by adding them one at a time, yielding an overall accuracy of 81.26% and a recall of 60.18%. A further analysis of the recognition features follows.
%K named entity recognition
%K data mining
%K conditional random field
%K medical information
%U http://xbna.pku.edu.cn/CN/abstract/abstract2889.shtml