
An Internet Traffic Classification Method Based on Bootstrapping

DOI: 10.13190/j.jbupt.2014.05.014

Keywords: semi-supervised learning, class imbalance, Bootstrapping, Internet traffic classification


Abstract:

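To address the labeling bottleneck and the imbalanced distribution of per-class sample counts in Internet traffic classification, a Bootstrapping-based traffic classification method is proposed. An initial classifier is trained on a small number of labeled samples, and unlabeled samples are then used iteratively to extend the sample set and update the classifier. When building the extended sample set, the correct classification of an unlabeled sample under a given posterior probability distribution is treated as a probabilistic event, and a new confidence measure is built on this basis to reduce the number of noisy samples in the extended set. A heuristic rule derived from probably approximately correct (PAC) learning theory favors adding minority-class samples to the extended set, which eases the class imbalance. Experimental results show that, compared with the initial classifier, the Bootstrapping-based traffic classifier can improve overall classification accuracy by 9.46%; compared with existing semi-supervised learning methods, it improves minority-class classification accuracy by 2.22%.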
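
The iterative procedure summarized in the abstract can be illustrated with a small Python sketch. The abstract gives neither the confidence formula nor the PAC-based heuristic rule, so this sketch substitutes two simple stand-ins: confidence is the largest posterior from a nearest-centroid model, and candidates whose predicted class currently has the fewest labeled samples are added first. All names here (`centroids`, `posterior`, `bootstrap`, `threshold`, `per_round`) are hypothetical and are not taken from the paper.

```python
import math


def centroids(X, y):
    """Fit a toy classifier: the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}


def posterior(model, x):
    """Stand-in posterior: softmax over negative squared centroid distances."""
    scores = {c: -sum((a - b) ** 2 for a, b in zip(x, m))
              for c, m in model.items()}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: e / total for c, e in exp.items()}


def bootstrap(X_lab, y_lab, X_unl, rounds=5, threshold=0.9, per_round=2):
    """Self-training loop: grow the labeled set from confident predictions.

    Assumption: the selection rule below (fewest labeled samples first,
    then highest confidence) only imitates the paper's preference for
    minority-class samples; it is not the paper's heuristic.
    """
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unl)
    model = centroids(X, y)          # initial classifier
    for _ in range(rounds):
        if not pool:
            break
        counts = {c: y.count(c) for c in model}
        candidates = []
        for idx, x in enumerate(pool):
            p = posterior(model, x)
            label = max(p, key=p.get)
            if p[label] >= threshold:  # drop low-confidence (likely noisy) samples
                candidates.append((counts[label], -p[label], idx, label))
        if not candidates:
            break
        candidates.sort()            # minority class first, then confidence
        chosen = candidates[:per_round]
        for _, _, idx, label in chosen:
            X.append(pool[idx])
            y.append(label)
        taken = {idx for _, _, idx, _ in chosen}
        pool = [x for i, x in enumerate(pool) if i not in taken]
        model = centroids(X, y)      # update the classifier
    return model, X, y
```

Calling `bootstrap` with a handful of labeled flows and a larger unlabeled pool returns the updated model together with the extended sample set. In practice the toy centroid model would be replaced by whatever base classifier and flow features are actually in use.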

