Polibits  2012 

String Distances for Near-duplicate Detection

Keywords: near-duplicate detection, string similarity measures, database, data mining.

Abstract:

Near-duplicate detection is important when dealing with large, noisy databases in data mining tasks. In this paper, we present the results of applying the rank distance and the Smith-Waterman distance, along with more popular string similarity measures such as the Levenshtein distance, together with a disjoint set data structure, to the problem of near-duplicate detection.
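To make the combination of a string distance with a disjoint set structure concrete, here is a minimal Python sketch. It is not the paper's implementation: it covers only the Levenshtein distance (not the rank distance or Smith-Waterman), it compares all pairs of records, and the names `levenshtein`, `DisjointSet`, `cluster_near_duplicates`, the length normalization, and the `threshold` value are all assumptions chosen for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the standard two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x: int) -> int:
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x: int, y: int) -> None:
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]


def cluster_near_duplicates(records, threshold=0.2):
    """Group records whose normalized edit distance is at most `threshold`.

    The normalization (distance divided by the longer length) and the
    default threshold are illustrative assumptions, not values from the paper.
    """
    ds = DisjointSet(len(records))
    for i, j in combinations(range(len(records)), 2):
        longest = max(len(records[i]), len(records[j])) or 1
        if levenshtein(records[i], records[j]) / longest <= threshold:
            ds.union(i, j)  # merge the two records' clusters
    groups = {}
    for i, record in enumerate(records):
        groups.setdefault(ds.find(i), []).append(record)
    return list(groups.values())


if __name__ == "__main__":
    names = ["John Smith", "Jon Smith", "Jane Doe", "John Smyth"]
    for group in cluster_near_duplicates(names):
        print(group)
```

The disjoint set makes near-duplication transitive: if record A is close to B and B is close to C, all three end up in one cluster even when A and C are farther apart than the threshold. The all-pairs loop is quadratic in the number of records, so a large database would likely need some form of candidate filtering before the distance is computed.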
