2017

Detecting of fake accounts with hierarchical clustering

DOI: 10.16511/j.cnki.qhdxxb.2017.26.029

Keywords: data security, fake accounts, machine learning, hierarchical clustering


Abstract:

The Internet is full of malicious users, and Internet service providers usually have a very large number of registered users, which makes it hard for a system to pick out the fake accounts among them. Fake accounts that a malicious user registers in bulk tend to resemble one another. This paper presents a system model for locating fake accounts in large volumes of registration data. The model first pre-classifies the data by the composition pattern of each username string, and then computes the string similarity between the elements of each class, measured as the Levenshtein distance. Hierarchical clustering with a suitable threshold then locates the groups of fake accounts hidden in the registration data. Experimental results show that the model is effective and that, compared with existing models, it depends less on the dimensionality and characteristics of the data.
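The pipeline described in the abstract (pattern-based pre-classification, pairwise Levenshtein distance, thresholded hierarchical clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The pattern alphabet (`L` for letters, `D` for digits, `S` for other symbols), the choice of single linkage, and the `threshold` and `min_size` parameters are all assumptions made here for illustration, since the abstract does not specify them.

```python
from collections import defaultdict
from itertools import combinations


def pattern(username: str) -> str:
    """Collapse a username into a coarse composition pattern.

    Assumed alphabet (not from the paper): L = letter, D = digit, S = other.
    Runs of the same kind are merged, so 'tom123' becomes 'LD'.
    """
    out = []
    for ch in username:
        kind = "L" if ch.isalpha() else "D" if ch.isdigit() else "S"
        if not out or out[-1] != kind:
            out.append(kind)
    return "".join(out)


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def single_linkage(names, threshold):
    """Single-linkage agglomerative clustering cut at `threshold`.

    Cutting a single-linkage dendrogram at distance t yields the connected
    components of the graph that links every pair at distance <= t, so a
    union-find structure is enough for this sketch.
    """
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(names)), 2):
        if levenshtein(names[i], names[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, name in enumerate(names):
        groups[find(i)].append(name)
    return list(groups.values())


def suspicious_groups(usernames, threshold=2, min_size=3):
    """Bucket usernames by pattern, cluster each bucket, keep large clusters."""
    buckets = defaultdict(list)
    for name in usernames:
        buckets[pattern(name)].append(name)

    result = []
    for bucket in buckets.values():
        for group in single_linkage(bucket, threshold):
            if len(group) >= min_size:
                result.append(sorted(group))
    return result
```

For example, `suspicious_groups(["lily001", "lily002", "lily003", "alice", "bob_smith", "k9", "zhangwei88"])` returns `[["lily001", "lily002", "lily003"]]`: the three `lily` names share the pattern `LD` and lie within edit distance 1 of one another, while the remaining names either fall into other pattern buckets or are too far away. Pre-classification also limits the quadratic pairwise comparison to each bucket rather than the whole dataset.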
