
An EMD-Based Metric for Document Semantic Similarity
(Original title: 一种基于EMD的文档语义相似性度量)

Keywords: EMD (Earth Mover's Distance), information retrieval, metric, document similarity, matching, semantic distance


Abstract:

The EMD (Earth Mover's Distance)-based measure of document semantic similarity conflicts with the metric axioms, which has prevented EMD from being widely applied in information retrieval and data mining. To resolve this conflict, a novel EMD-based metric for document semantic similarity, named Mdss_EMD, is presented. First, based on an analysis of the drawbacks of EMD and its existing modifications, the concepts of document width and virtual term are proposed. Next, a virtual term is added to the initial document vector so that the total weights of the document vectors are aligned and all of the metric axioms are satisfied. Finally, to improve the applicability and processing speed of the metric, the similarity distance of the virtual term is designed to be elastic, and the EMD algorithm is simplified. The proposed approach extends EMD to a metric space and substantially improves EMD in indexing and accuracy. The experimental results demonstrate that Mdss_EMD generally outperforms the original EMD and other similar measures.
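
The weight-alignment idea in the abstract can be illustrated with a minimal sketch. It is not the paper's Mdss_EMD algorithm, whose details (document width, the elastic distance rule, the simplified solver) are not given here. The sketch assumes documents are bags of terms with integer weights, pads the lighter document with one virtual term so both totals match, and solves the resulting balanced transportation problem with a generic successive-shortest-path min-cost flow. The names `VIRTUAL`, `pad_with_virtual_term`, `transport_cost`, `padded_emd`, and the free parameter `virtual_dist` are all hypothetical.

```python
from math import inf
from typing import Callable, Dict, List, Tuple

VIRTUAL = "<virtual>"  # hypothetical label for the padding term


def pad_with_virtual_term(a: Dict[str, int], b: Dict[str, int]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return copies of a and b with equal total weight; the lighter one gets a virtual term."""
    ta, tb = sum(a.values()), sum(b.values())
    a, b = dict(a), dict(b)
    if ta < tb:
        a[VIRTUAL] = tb - ta
    elif tb < ta:
        b[VIRTUAL] = ta - tb
    return a, b


def transport_cost(supply: List[int], demand: List[int], cost: List[List[float]]) -> float:
    """Minimum cost of shipping all supply to demand (equal totals), by successive shortest paths."""
    n, m = len(supply), len(demand)
    src, dst, size = n + m, n + m + 1, n + m + 2
    edges: List[list] = []                      # [to, capacity, cost]; edge e ^ 1 reverses edge e
    graph: List[List[int]] = [[] for _ in range(size)]

    def add(u: int, v: int, cap: int, c: float) -> None:
        graph[u].append(len(edges)); edges.append([v, cap, c])
        graph[v].append(len(edges)); edges.append([u, 0, -c])

    for i, s in enumerate(supply):
        add(src, i, s, 0.0)
    for j, d in enumerate(demand):
        add(n + j, dst, d, 0.0)
    for i in range(n):
        for j in range(m):
            add(i, n + j, min(supply[i], demand[j]), cost[i][j])

    total = 0.0
    while True:
        # Bellman-Ford on the residual graph (reverse edges carry negative costs).
        dist = [inf] * size
        prev = [-1] * size
        dist[src] = 0.0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == inf:
                    continue
                for e in graph[u]:
                    v, cap, c = edges[e]
                    if cap > 0 and dist[u] + c < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + c, e, True
            if not changed:
                break
        if dist[dst] == inf:
            return total                        # no augmenting path left: all mass is shipped
        push, v = inf, dst
        while v != src:                         # bottleneck capacity along the path
            push = min(push, edges[prev[v]][1])
            v = edges[prev[v] ^ 1][0]
        v = dst
        while v != src:                         # augment along the path
            edges[prev[v]][1] -= push
            edges[prev[v] ^ 1][1] += push
            v = edges[prev[v] ^ 1][0]
        total += push * dist[dst]


def padded_emd(a: Dict[str, int], b: Dict[str, int],
               ground: Callable[[str, str], float], virtual_dist: float) -> float:
    """Unnormalized transport cost between a and b after virtual-term padding."""
    a, b = pad_with_virtual_term(a, b)
    ta, tb = list(a), list(b)
    cost = [[virtual_dist if VIRTUAL in (s, t) else ground(s, t) for t in tb] for s in ta]
    return transport_cost([a[s] for s in ta], [b[t] for t in tb], cost)


def zero_one(s: str, t: str) -> float:
    """Toy ground distance (an assumption): 0 for identical terms, 1 otherwise."""
    return 0.0 if s == t else 1.0


doc_a = {"cat": 2, "dog": 1}
doc_b = {"cat": 2, "dog": 1, "fish": 2}
print(padded_emd(doc_a, doc_b, zero_one, 0.5))  # 1.0
```

In this toy case `doc_a` is a strict subset of `doc_b`, yet the padded distance is nonzero because the two surplus units of weight must be matched to the virtual term at cost `virtual_dist` each. How `virtual_dist` should be chosen so that the metric axioms hold is exactly what the paper addresses, so the value 0.5 here is only a placeholder.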
