Acta Electronica Sinica  2015

A New Fast Online Spam Identification Method Based on User Interest Sets

DOI: 10.3969/j.issn.0372-2112.2015.10.013, PP. 1963-1970

Keywords: spam, user interest set, support vector machine, active learning, online application

Abstract:

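To effectively increase email identification speed without significantly reducing spam identification accuracy, a new fast online spam identification method is proposed. First, the concepts of a user's positive and negative interest sets are introduced, and emails are classified by combining the user interest sets with a support vector machine. Then, following active learning theory, the sample density of the training set is combined with an improved angle diversity method to find the samples whose classification is most uncertain, and these are recommended to the user for labeling. Finally, the labeled samples and the most confidently classified samples are added to the training set, and a new sample value evaluation function is used to eliminate redundant samples and generate a new training set. Experiments show that the proposed method places a small labeling burden on the user, identifies spam with high accuracy and speed, and has high value for online application.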
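
The selection step in the abstract (send the least certain samples to the user, keep the most certain ones for the training set) can be illustrated with a minimal sketch. This is not the paper's criterion: the abstract names a combination of training-set sample density and an improved angle diversity method, whose details are not given here, so the sketch substitutes plain margin-based uncertainty on a linear classifier. The function names, the `confident_margin` threshold, and the convention that a positive score means spam are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def decision_value(w: Vector, b: float, x: Vector) -> float:
    """Signed score of a linear classifier, w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def split_by_certainty(
    w: Vector,
    b: float,
    pool: List[Vector],
    n_query: int,
    confident_margin: float,
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Split an unlabeled pool into samples to query and samples to self-label.

    Assumption: uncertainty is measured only by |score|, the distance-like
    margin to the decision boundary. The density and angle diversity terms
    mentioned in the abstract are NOT modeled here.
    """
    scores = [decision_value(w, b, x) for x in pool]
    # Smallest |score| first: these are closest to the boundary.
    ranked = sorted(range(len(pool)), key=lambda i: abs(scores[i]))
    query = ranked[:n_query]
    # Remaining samples far enough from the boundary get the predicted label
    # (+1 = spam, -1 = legitimate; a hypothetical convention).
    confident = [
        (i, 1 if scores[i] > 0 else -1)
        for i in ranked[n_query:]
        if abs(scores[i]) >= confident_margin
    ]
    return query, confident


if __name__ == "__main__":
    weights, bias = [1.0, -1.0], 0.0
    unlabeled = [[2.0, 0.0], [0.1, 0.0], [0.0, 3.0], [0.5, 0.55]]
    to_label, auto_labeled = split_by_certainty(weights, bias, unlabeled, 2, 1.0)
    print("ask the user about:", to_label)
    print("add with predicted label:", auto_labeled)
```

In a full loop, the indices in `to_label` would be shown to the user, the answers and `auto_labeled` pairs would be appended to the training set, and the classifier would be retrained. The redundant-sample elimination described in the abstract is left out because its evaluation function is not specified there.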

References

[1]  Liu W Y,Wang T.Online active multi-field learning for efficient email spam filtering[J].Knowledge and Information Systems,2012,33(1):117-136.
[2]  Bertini J R,Zhao L,Lopes A A.An incremental learning algorithm based on the K-associated graph for non-stationary data classification[J].Information Sciences,2013,246:52-68.
[3]  Costa J,Silva C,Antunes M,Ribeiro B.Customized crowds and active learning to improve classification[J].Expert Systems with Applications,2013,40(18):7212-7219.
[4]  Syed N A,Liu H,Huan S,et al.Handling concept drifts in incremental learning with support vector machines[A].Proceedings of the Workshop on Support Vector Machines at the International Joint Conference on Artificial Intelligence[C].Stockholm,Sweden,1999.317-321.
[5]  Wu C M,Wang X D,Bai D Y,et al.Fast incremental learning algorithm of SVM on KKT conditions[A].Sixth International Conference on Fuzzy Systems and Knowledge Discovery[C].Tianjin,China:IEEE Press,2009.551-554.
[6]  Amayri O,Bouguila N.A study of spam filtering using support vector machines[J].Artificial Intelligence Review,2010,34(1):73-108.
[7]  Tong S,Chang E.Support vector machine active learning for image retrieval[A].Proceedings of the 9th ACM International Conference on Multimedia[C].New York,USA:ACM,2001.107-118.
[8]  Hu L S,Lu S X,Wang X Z.A new and informative active learning approach for support vector machine[J].Information Sciences,2013,244:142-160.
[9]  Leng Y,Xu X Y,Qi G H.Combining active learning and semi-supervised learning to construct SVM classifier[J].Knowledge-Based Systems,2013,44(5):121-131.
[10]  Chen Rong,Cao Yong-feng,Sun Hong.Multi-class image classification with active learning and semi-supervised learning[J].Acta Automatica Sinica,2011,37(8):954-962.(in Chinese)
[11]  Ali Haji N,Ibrahim N S.Porter stemming algorithm for semantic checking[A].ICCIT 2012[C].Chittagong University,Chittagong,2012.253-258.
[12]  Yang J M,Liu Y N,Zhu X D,et al.A new feature selection based on comprehensive measurement both in inter-category and intra-category for text categorization[J].Information Processing & Management,2012,48 (4):741-754.
[13]  Platt J.Sequential minimal optimization:a fast algorithm for training support vector machines[R].Microsoft Research,1998-04-21.
[14]  Cormack G V.TREC 2007 spam track overview[A].Proceedings of the 16th Text Retrieval Conference[C].National Institute of Standards and Technology,Special Publication:2007.500.
[15]  Ding Wen-jun,Xue An-rong.Fast incremental learning SVM for web text classification[J].Application Research of Computers,2012,29(4):1275-1278.(in Chinese)