A semi-structured document model for text mining

Keywords: semi-structured document, XML, HTML, text mining, vector space model, structured link vector model, semi-structured document model, structural information


Abstract:

A semi-structured document carries more structural information than an ordinary document, and the relations among semi-structured documents can also be exploited. To take advantage of the structure and link information in semi-structured documents for better mining, this paper presents a structured link vector model (SLVM), in which each document is represented by a vector whose elements are determined by the document's terms, its structure, and its neighboring documents. For brevity and clarity, text mining based on SLVM is described through the K-means procedure, which involves two steps: calculating document similarity and calculating the cluster center. In the experiments, clustering based on SLVM performs significantly better than clustering based on a conventional vector space model: the F value increases from 0.65-0.73 to 0.82-0.86.
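
The two K-means steps named in the abstract (document similarity and cluster center) can be illustrated with a minimal Python sketch. It is not the paper's SLVM formulation, which the abstract does not give. It only assumes that a term's weight depends on the structural element it appears in and that a document's vector is blended with the vectors of its linked neighbors. The element weights, the link weight, the function names, and the sample documents are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical weights: the abstract does not state how elements are weighted.
ELEMENT_WEIGHTS = {"title": 3.0, "body": 1.0}
# Hypothetical share of the vector contributed by neighboring (linked) documents.
LINK_WEIGHT = 0.3


def own_vector(doc):
    """Term vector where each occurrence counts as the weight of its element."""
    vec = Counter()
    for element, text in doc["elements"].items():
        weight = ELEMENT_WEIGHTS.get(element, 1.0)
        for term in text.lower().split():
            vec[term] += weight
    return vec


def linked_vector(doc_id, docs):
    """Blend a document's own vector with the mean vector of its neighbors."""
    vec = Counter(own_vector(docs[doc_id]))
    neighbors = docs[doc_id].get("links", [])
    for neighbor_id in neighbors:
        for term, value in own_vector(docs[neighbor_id]).items():
            vec[term] += LINK_WEIGHT * value / len(neighbors)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(value * b[term] for term, value in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Cluster center: the component-wise mean of the member vectors."""
    total = Counter()
    for vec in vectors:
        for term, value in vec.items():
            total[term] += value / len(vectors)
    return total


# Toy corpus (made up for illustration): "a" links to "b"; "c" is unrelated.
docs = {
    "a": {"elements": {"title": "xml mining", "body": "clustering xml documents"},
          "links": ["b"]},
    "b": {"elements": {"title": "xml model", "body": "vector space model"},
          "links": []},
    "c": {"elements": {"title": "soccer results", "body": "league match results"},
          "links": []},
}
vectors = {doc_id: linked_vector(doc_id, docs) for doc_id in docs}
center_ab = centroid([vectors["a"], vectors["b"]])
```

In a full K-means loop, each document would be assigned to the center with the highest `cosine` value and the centers recomputed with `centroid` until assignments stop changing. Setting `LINK_WEIGHT` to 0 and all element weights to 1 reduces the sketch to a plain term-count vector space model, which makes the two representations easy to compare.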
