This paper evaluates several feature selection and feature reduction scenarios, with the objective of building a real-time anomaly-based intrusion detection system. Each scenario is evaluated on the realistic Kyoto 2006+ dataset, and the effect of reducing the number of features on classification performance and execution time is measured. The HVS feature selection technique detailed in this paper shows advantages in terms of consistency, classification performance and execution time.