A Novel Approach to Detect the Near Duplicate by Refining Provenance Matrix

Keywords: near-duplicates, provenance, distrusted, provenance matrix, trustworthiness

Abstract:

In this paper, the provenance matrix is refined by adding two more factors, 'How' and 'Why', to improve the accuracy and efficiency of near-duplicate detection. The performance of web search depends on search results that are free of duplicate or redundant information. More redundancy consumes more time and storage, which is why search engines try to avoid indexing duplicate documents. The provenance model combines content-based and trust-based factors to classify documents as near-duplicates or originals, since many near-duplicates nowadays come from distrusted websites.
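
As a rough illustration of how a content-based factor and a trust-based factor might be combined, the sketch below scores two pages by word-shingle Jaccard similarity and then uses a trust score to decide which one is treated as the original. This is not the paper's provenance matrix: the `Page` type, the `trust` field, the shingle size, and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    trust: float  # hypothetical trust score in [0, 1]; not defined in the abstract


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(candidate: Page, reference: Page, sim_threshold: float = 0.5) -> str:
    """Label the candidate relative to the reference page.

    Assumed rule: content must be similar enough (content-based factor),
    and the candidate must be no more trusted than the reference
    (trust-based factor) to be called a near-duplicate.
    """
    sim = jaccard(shingles(candidate.text), shingles(reference.text))
    if sim >= sim_threshold and candidate.trust <= reference.trust:
        return "near-duplicate"
    return "original"


reference = Page("https://example.org/a", "the quick brown fox jumps over the lazy dog", 0.9)
copy = Page("https://example.net/b", "the quick brown fox jumps over the lazy cat", 0.2)
other = Page("https://example.com/c", "completely unrelated text about cooking pasta at home", 0.9)

print(classify(copy, reference))   # similar content, lower trust
print(classify(other, reference))  # dissimilar content
```

In this sketch a similar page from a less trusted site is flagged as the near-duplicate, which mirrors the abstract's observation that many near-duplicates come from distrusted websites. The 'How' and 'Why' factors are not modelled here, since the abstract does not define them.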
