Chan P K,Stolfo S J.Toward scalable learning with nonuniform class and cost distributions:A case study in credit card fraud detection[A].Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining[C].New York:AAAI,1998.164-168.
[2]
Phua C,Alahakoon D,Lee V.Minority report in fraud detection:Classification of skewed data[J].SIGKDD Explorations,2004,6(1):50-59.
[3]
Lewis D,Gale W.A sequential algorithm for training text classifiers[A].Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval[C].Dublin:ACM,1994.3-12.
[4]
Turney P D.Learning algorithms for keyphrase extraction[J].Information Retrieval,2000,2(4):303-336.
[5]
Ling C X,Li C.Data mining for direct marketing:Problems and solutions[A].Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining[C].New York:AAAI,1998.73-79.
[6]
Japkowicz N.The class imbalance problem:Significance and strategies[A].Proceedings of the 2000 International Conference on Artificial Intelligence:Special Track on Inductive Learning[C].Las Vegas:AAAI,2000.111-117.
[7]
Liu Xu-Ying,Wu Jian-xin,Zhou Zhi-Hua.Exploratory undersampling for class-imbalance learning[J].IEEE Transactions on Systems,Man and Cybernetics,2009,39(2):539-550.
[8]
Zhou Z H,Liu X Y.Training cost-sensitive neural networks with methods addressing the class imbalance problem[J].IEEE Transactions on Knowledge and Data Engineering,2006,18(1):63-77.
[9]
Chawla N V,Bowyer K W,Hall L O,Kegelmeyer W P.SMOTE:Synthetic minority over-sampling technique[J].Journal of Artificial Intelligence Research,2002,16(1):321-357.
[10]
杨智明,乔立岩,彭喜元.基于改进SMOTE的不平衡数据挖掘方法研究[J].电子学报,2007,35(12A):22-26. Yang Zhi-Ming,Qiao Li-Yan,Peng Xi-Yuan.Research on data mining method for imbalanced dataset based on improved SMOTE[J].Acta Electronica Sinica,2007,35(12A):22-26.(in Chinese)
[11]
Han H,Wang W Y,Mao B H.Borderline-SMOTE:A new over-sampling method in imbalanced data sets learning[A].Proceedings of the 2005 International Conference on Advances in Intelligent Computing[C].Berlin,Heidelberg:Springer-Verlag,2005,3644.878-887.
[12]
曾志强,吴群,廖备水,高济.一种基于核SMOTE的非平衡数据集分类方法[J].电子学报,2009,37(11):2489-2495. Zeng Zhi-Qiang,Wu Qun,Liao Bei-Shui,Gao Ji.A classification method for imbalance data set based on kernel SMOTE[J].Acta Electronica Sinica,2009,37(11):2489-2495.(in Chinese)
[13]
李正欣,赵林度.基于SMOTEBoost的非均衡数据集SVM分类器[J].系统工程,2008,26(5):116-119. Li Zheng-Xin,Zhao Lin-Du.A SVM classifier for imbalanced datasets based on SMOTEBoost[J].Systems Engineering,2008,26(5):116-119.(in Chinese)
[14]
毕华,梁洪力,王珏.重采样方法与机器学习[J].计算机学报,2009,32(5):862-877. Bi Hua,Liang Hong-Li,Wang Jue.Resampling method and machine learning[J].Chinese Journal of Computers,2009,32(5):862-877.(in Chinese)
[15]
Fan X N,Tang K,Weise T.Margin-based over-sampling method for learning from imbalanced datasets[A].Proceedings of the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining[C].Berlin:Springer,2011.24-27.
[16]
欧阳震诤,罗建书,胡东敏,吴泉源.一种不平衡数据流集成分类模型[J].电子学报,2010,38(1):184-189. Ouyang Zhen-Zheng,Luo Jian-Shu,Hu Dong-Min,Wu Quan-Yuan.An ensemble classifier framework for mining imbalanced data streams[J].Acta Electronica Sinica,2010,38(1):184-189.(in Chinese)
[17]
周志华,陈世福.神经网络集成[J].计算机学报,2002,25(1):1-8. Zhou Zhi-Hua,Chen Shi-Fu.Neural network ensemble[J].Chinese Journal of Computers,2002,25(1):1-8.(in Chinese)
[18]
Zhou Z H,Jiang Y.Medical diagnosis with C4.5 rule preceded by artificial neural network ensemble[J].IEEE Transactions on Information Technology in Biomedicine,2003,7(1):37-42.
Brodley C E,Friedl M A.Identifying mislabeled training data[J].Journal of Artificial Intelligence Research,1999,11(1):131-167.
[21]
Muhlenbach F,Lallich S,Zighed D.Identifying and handling mislabelled instances[J].Journal of Intelligent Information Systems,2004,22(1):89-109.
[22]
Gamberger D,Lavrac N,Dzeroski S.Noise elimination in inductive concept learning:A case study in medical diagnosis[A].Proceedings of the 7th International Workshop on Algorithmic Learning Theory[C].Berlin,Heidelberg:Springer-Verlag,1996,1160.199-212.
[23]
Fawcett T.ROC graphs:Notes and practical considerations for data mining researchers[R].USA:Technical Report HP Labs,2003.
[24]
Garcia V,Sanchez J S,Mollineda R A.On the use of surrounding neighbors for synthetic over-sampling of the minority class[A].Proceedings of 8th WSEAS International Conference on Simulation,Modeling and Optimization[C].Santander:WSEAS Press,2008.23-25.
[25]
He H,Bai Y,Garcia E A,Li S.ADASYN:Adaptive synthetic sampling approach for imbalanced learning[A].Proceedings of 2008 IEEE International Joint Conference on Neural Networks[C].Hong Kong:IEEE Press,2008.1322-1328.
[26]
Calleja J D L,Fuentes O.A distance-based over-sampling method for learning from imbalanced data sets[A].Proceedings of the 20th International Florida Artificial Intelligence Research Society Conference[C].Florida:AAAI Press,2007.634-635.
[27]
杨炳儒,谢永红,侯伟,周谆.基于复合金字塔模型的蛋白质二级结构预测系统的研究[J].科学通报,2009,54(21):3311-3319. Yang Bing-Ru,Xie Yong-Hong,Hou Wei,Zhou Zhun.A novel protein secondary structure prediction system based on compound pyramid model[J].Chinese Science Bulletin,2009,54(21):3311-3319.(in Chinese)
[28]
Yang B R,Hou W,Zhou Z,Quan H B.KAAPRO:An approach of protein secondary structure prediction based on KDD* in the compound pyramid prediction model[J].Expert Systems With Applications,2009,36(1):9000-9006.