


A Detection Method for Approximately Duplicate Records and Its Application in Data Warehouse ETL

Keywords: ETL, position coding, data warehouse, approximately duplicate records


Abstract:

Detecting and eliminating approximately duplicate records is one of the main problems to be solved in data cleaning and data quality improvement. Position-coding technology is introduced into the ETL process of a data warehouse, and a novel detection algorithm for approximately duplicate records, the Position-Coding Method (PCM), is presented. The algorithm applies to both Chinese and Western character sets. Experimental comparison with previous work indicates that the method is effective.
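The abstract does not describe how PCM encodes or compares records, so the following is not the paper's algorithm. It is only a minimal sketch of the underlying task, flagging pairs of records whose text is nearly identical, using the generic similarity ratio from Python's standard `difflib` module. The function names, the sample records, and the 0.8 threshold are all assumptions chosen for illustration. Because Python strings are Unicode, the same comparison works on Chinese and Western text.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def find_near_duplicates(records, threshold=0.8):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold.

    This is a brute-force pairwise comparison (a stand-in, not PCM);
    the threshold of 0.8 is an arbitrary illustrative value.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if similarity(a, b) >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sample records mixing Western and Chinese text.
records = [
    "International Business Machines",
    "Internationl Business Machines",   # one letter missing
    "北京市海淀区中关村大街1号",
    "北京市海淀区中关村大街一号",          # digit written as a Chinese numeral
    "Shanghai Data Warehouse Co.",
]

print(find_near_duplicates(records))
```

Comparing every pair costs quadratic time in the number of records, which is why dedicated methods for ETL pipelines aim to reduce the number of comparisons or make each one cheaper.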
