Malware Classification with LSTM and GRU Language Models and a Character-Level CNN

Ben Athiwaratkun; Jack W. Stokes

Proceedings IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP) | March 2017

Malicious software, or malware, continues to be a problem for computer users, corporations, and governments. Previous research [Pascanu2015] has explored training file-based malware classifiers using a two-stage approach. In the first stage, a malware language model learns the feature representation, which is then input to a second-stage malware classifier. In Pascanu et al. [Pascanu2015], the language model is either a standard recurrent neural network (RNN) or an echo state network (ESN). In this work, we propose several new malware classification architectures, which include a long short-term memory (LSTM) language model and a gated recurrent unit (GRU) language model. In addition to the temporal max pooling used in [Pascanu2015], we propose an attention mechanism, similar to that of [Bahdanau2015] from the machine translation literature, as an alternative way to construct the file representation from neural features. Finally, we propose a new single-stage malware classifier based on a character-level convolutional neural network (CNN). Results show that the LSTM with temporal max pooling and logistic regression offers a 31.3% improvement in the true positive rate compared to the best system in [Pascanu2015] at a false positive rate of 1%.
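
As a rough illustration of the two ways of building a file representation mentioned in the abstract, the sketch below contrasts temporal max pooling with a simplified attention-weighted average over per-step hidden states. The hidden-state values are made up, and the dot-product scoring against a hypothetical `query` vector is an assumption chosen for brevity; it is not the additive attention of [Bahdanau2015], nor the authors' implementation.

```python
import math

def temporal_max_pool(hidden_states):
    """Element-wise maximum over time steps: one feature vector per file."""
    return [max(column) for column in zip(*hidden_states)]

def attention_pool(hidden_states, query):
    """Softmax-weighted average of hidden states.

    Scores are dot products with `query`, a hypothetical learned vector
    (an assumption for illustration only).
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(len(hidden_states[0]))]
    return pooled, weights

# Toy hidden states: 3 time steps, 2 hidden units (made-up values standing
# in for the outputs of an LSTM or GRU language model).
states = [[0.1, -0.5], [0.9, 0.2], [-0.3, 0.7]]
max_features = temporal_max_pool(states)  # [0.9, 0.7]
att_features, att_weights = attention_pool(states, query=[1.0, 0.0])
```

In a two-stage setup of the kind described, either pooled vector would then serve as the input to the second-stage classifier, for example logistic regression.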