%0 Journal Article
%T Discovering Abnormal Data in RDF Knowledge Base
%A 贺彬彬
%A 邹磊
%A 赵东岩
%J 北京大学学报(自然科学版)
%D 2015
%X To improve the data quality of RDF knowledge bases, a method is proposed for detecting abnormal data in RDF graphs and repairing it automatically. First, the authors introduce the graph-based conditional functional dependency (GCFD), which represents attribute-value dependencies and semantic-structure dependencies of RDF data in a uniform manner. Then, an efficient algorithmic framework and novel pruning rules are proposed to discover GCFDs in RDF data, and a workflow for automatically repairing erroneous data is given. Finally, extensive experiments on several real-life RDF repositories confirm the feasibility and superiority of the proposed solution.
%K RDF data quality
%K graph-based conditional functional dependencies (GCFD)
%K conditional functional dependency
%K functional dependency
%U http://xbna.pku.edu.cn/CN/abstract/abstract2586.shtml