Memory forensics is a young but fast-growing area of research and a promising one for the field of computer forensics. The learned model is proposed to reside in an isolated core with strict communication restrictions, achieving both incorruptibility and efficiency and thereby providing a probabilistic memory-level view of the system that is consistent with the user-level view. The lower-level memory blocks are constructed from primary block sequences of varying sizes, which are fed as input into Long Short-Term Memory (LSTM) models. Four configurations of the LSTM model are explored by adding bidirectionality and attention. Assembly-level data are extracted from 50 Windows portable executable (PE) files, and basic blocks are constructed using the IDA disassembler toolkit. The results show that longer primary block sequences yield richer LSTM hidden-layer representations. The hidden states are fed as features into max-pooling or attention layers, depending on the configuration under test, and the final classification is performed by logistic regression with a single hidden layer. The bidirectional LSTM with attention, applied to basic block sequences of size 29, proved to be the best model. The differences between the models' ROC curves indicate a strong reliance on the lower-level instruction features, as opposed to metadata or string features.
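The pipeline described above pools a sequence of LSTM hidden states (via max pooling or attention) and passes the pooled vector to a logistic-regression classifier. The following is a minimal pure-Python sketch of the attention-pooling and classification steps only; the hidden states, alignment scores, weights, and dimensions are illustrative placeholders, not the paper's trained parameters or its Keras implementation.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_pool(hidden_states, scores):
    """Collapse a sequence of LSTM hidden states (one vector per time
    step) into a single context vector, weighting each time step by its
    softmax-normalised attention score."""
    weights = softmax(scores)
    dim = len(hidden_states[0])
    return [sum(w * h[i] for w, h in zip(weights, hidden_states))
            for i in range(dim)]

def logistic_regression(context, w, b):
    """Single-output logistic regression over the pooled context vector."""
    z = sum(wi * xi for wi, xi in zip(w, context)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Toy example: 3 time steps, hidden size 2 (all values hypothetical).
hidden = [[0.1, 0.4], [0.3, 0.2], [0.9, 0.5]]
scores = [0.2, 0.1, 1.5]            # hypothetical alignment scores
ctx = attention_pool(hidden, scores)
prob = logistic_regression(ctx, w=[1.0, -0.5], b=0.0)
```

With uniform scores, attention pooling reduces to averaging the hidden states; the learned scores let the model emphasise the basic blocks most indicative of the class, which is the behaviour the attention configurations in the paper exploit.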