2019

Efficient Active Learning for Electronic Medical Record De-identification

Abstract:

Electronic medical records are often de-identified before being disseminated for secondary uses. However, unstructured natural language records are challenging to de-identify, and doing so consumes a considerable amount of expensive human annotation. In this investigation, we incorporate active learning into the de-identification workflow to reduce annotation requirements. We apply this approach to a real clinical trials dataset and a publicly available i2b2 dataset. The results illustrate that when the machine learning de-identification system can actively request information from beyond the system (e.g., from a knowledgeable human assistant) to help create a better model, less training data is needed to maintain or improve the performance of trained models than in the typical passive learning framework. Specifically, with a batch size of 10 documents, an active learning approach requires only 40 documents to reach an F-measure of 0.9, while passive learning needs at least 25% more data to train a comparable model.
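
The batch-wise workflow described in the abstract can be illustrated with a minimal pool-based sketch. The abstract does not state which query strategy the authors use, so uncertainty sampling is assumed here purely for illustration. The function name `run_active_learning` and the callables `annotate`, `train`, and `score_uncertainty` are hypothetical placeholders, not part of the paper; only the batch size of 10 and the budget of 40 documents are taken from the abstract.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple


def run_active_learning(
    pool: Sequence[Any],
    annotate: Callable[[Any], Any],
    train: Callable[[List[Tuple[Any, Any]]], Any],
    score_uncertainty: Callable[[Any, Any], float],
    batch_size: int = 10,   # batch size quoted in the abstract
    budget: int = 40,       # document count quoted in the abstract
    seed: int = 0,
) -> Tuple[Any, List[Tuple[Any, Any]]]:
    """Pool-based active learning with uncertainty sampling (assumed strategy).

    annotate(doc)            -> label supplied by the human annotator
    train(labeled)           -> model fitted on (doc, label) pairs
    score_uncertainty(m, d)  -> higher means the model is less sure about d
    """
    rng = random.Random(seed)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)

    # Seed round: no model exists yet, so the first batch is chosen at random.
    first = min(batch_size, budget, len(unlabeled))
    labeled = [(doc, annotate(doc)) for doc in unlabeled[:first]]
    unlabeled = unlabeled[first:]
    model = train(labeled)

    while len(labeled) < budget and unlabeled:
        # Rank the remaining documents, most uncertain first.
        unlabeled.sort(key=lambda d: score_uncertainty(model, d), reverse=True)
        n = min(batch_size, budget - len(labeled))
        batch, unlabeled = unlabeled[:n], unlabeled[n:]
        # Ask the annotator only about the selected batch, then retrain.
        labeled.extend((doc, annotate(doc)) for doc in batch)
        model = train(labeled)

    return model, labeled
```

A passive learning baseline would correspond to replacing the ranking step with a random draw, so that the annotator labels documents in arbitrary order rather than those the current model finds hardest.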
