AUTHORS: Li Chen, Ravi Sahita – Intel Corporation, Jugal Parikh, Marc Marino – Microsoft Corporation

Our joint team of researchers from Intel Labs and Microsoft Threat Intelligence studied how to apply deep transfer learning from computer vision to static malware classification. Our STAMINA approach (STAtic Malware-as-Image Network Analysis) achieved high accuracy and low false positives, ultimately avoiding time-consuming manual feature engineering.

Malware refers to different types of malicious software, including viruses, ransomware and spyware. Cyberattackers develop malicious code to cause extensive damage to data and systems, or to gain unauthorized access to a network. Malware poses risks to organizations and individuals, including impaired usability, data loss, intellectual property theft, and monetary loss. It can even put human life at risk, according to a 2018 report by Microsoft Security Intelligence.

Choosing a malware detection approach

Classical malware detection approaches extract the binary signatures or fingerprints of the malware. However, the rapid and exponential increase in signatures makes binary signature matching less straightforward. Other detection approaches include static and dynamic analysis, both of which have advantages and disadvantages. Static analysis disassembles the code, but its performance can suffer from code obfuscation. Dynamic analysis, while able to unpack the code, can be time consuming.

In our paper titled "STAMINA: Scalable deep learning approach for malware classification," we used a static analysis approach to malware classification. Static analysis is a quick and straightforward way to detect malware without executing the application or monitoring its runtime behavior. We defined malware as a combination of known malware (previously classified), potentially unwanted applications (PUAs), and unknown binaries (with no known provenance or history).

We considered malware classification as a computer vision problem. Using a transfer learning scheme, we borrowed knowledge learned from natural images or objects and applied it to static malware detection. This shortened the training time of deep neural networks while still maintaining high classification performance. We demonstrated the effectiveness of this approach on a real-world user dataset, showing that transfer learning and computer vision can achieve high classification performance on malware.

Our method consisted of three main steps: preprocessing (reshaping, resizing, and replicating), training a deep neural network via transfer learning, and evaluation. Resizing as a preprocessing step does not negatively impact the classification result, since our system trains a very deep neural network to extract the deep-represented features. As seen in the experimental results, our system can outperform many other classifiers and results from prior art. Furthermore, for malware from the same family, resizing still results in similar patterns.

Building on Intel Labs enhanced malware detection

Traditionally, manual feature engineering has been a key ingredient in a successful machine learning malware detector. While this time-consuming manual process puts heavy emphasis on excellent feature construction, an automated method would be more effective.

A 2018 Intel Labs study titled “Deep transfer learning for static malware classification” proposed using an enhanced malware detection framework with deep transfer learning to train directly on malware images. Based on visual inspection of malware binaries represented as grey-scale images, researchers observed that binaries from the same malware family share structural similarities, while binaries from different families show distinct structural or textural information.

This visual inspection provided the foundation for treating the malware classification problem as a vision classification task. Through our current collaboration, Intel Labs and Microsoft Threat Intelligence researchers sought to establish the practical value of this image-based transfer learning approach for static malware classification, based on a real-world data set.


STAMINA essentially consists of three steps: preprocessing image conversion, transfer learning, and evaluation. The preprocessing image conversion step directly converts the raw binary into two-dimensional images with minimal feature engineering.

Figure 1: First three steps of the STAMINA method.
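
The image conversion described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the STAMINA implementation: it assumes one byte per grayscale pixel, a fixed row width, zero padding for the last row, nearest-neighbor resizing, and replication of the single channel into three. The function names, the default width, and the output size are all hypothetical choices made for the example.

```python
import math


def bytes_to_grid(data, width):
    """Reshape raw bytes into rows of `width` pixels (values 0-255).

    The last row is zero-padded; this padding rule is an assumption.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not data:
        raise ValueError("data must not be empty")
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def resize_nearest(grid, out_h, out_w):
    """Resize a 2D grid with nearest-neighbor sampling (assumed method)."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def replicate_channels(grid, channels=3):
    """Copy each grayscale pixel into `channels` identical channels."""
    return [[[px] * channels for px in row] for row in grid]


def binary_to_image(data, width=16, size=(8, 8)):
    """Reshape, resize, then replicate: returns a [height][width][channel] list."""
    grid = bytes_to_grid(data, width)
    resized = resize_nearest(grid, size[0], size[1])
    return replicate_channels(resized)


if __name__ == "__main__":
    sample = bytes(range(256))  # stand-in for the contents of a binary file
    image = binary_to_image(sample)
    print(len(image), len(image[0]), len(image[0][0]))
```

Replicating the channel is one way to make a grayscale image match the three-channel input that networks pre-trained on natural images typically expect. A production pipeline would more likely use an image library for resizing.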

The transfer learning step is done on the malware and benign images. Transfer learning has been heavily employed in computer vision. The idea is to borrow the knowledge a model has learned in one domain and apply it to another, targeted domain. Typically, practitioners take a model pre-trained on an existing image dataset, freeze a portion of the layers, and fine-tune the last few layers on the newly obtained dataset.
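
To make the freeze-and-fine-tune idea concrete, here is a toy sketch in plain Python. It is not the network used in this work: a fixed linear map with ReLU stands in for the frozen pre-trained layers, and only a small fully connected layer with softmax is trained on top. All names, weights, and hyperparameters are hypothetical.

```python
import math


def frozen_features(x, frozen_w):
    """Stand-in for frozen pre-trained layers: fixed linear map + ReLU.

    These weights are never updated during fine-tuning.
    """
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in frozen_w]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def head_forward(feats, head_w, head_b):
    """Trainable head: one fully connected layer followed by softmax."""
    logits = [
        sum(w * f for w, f in zip(row, feats)) + b
        for row, b in zip(head_w, head_b)
    ]
    return softmax(logits)


def fine_tune_head(samples, labels, frozen_w, n_classes=2, lr=0.1, epochs=200):
    """Train only the head with per-sample gradient descent on cross-entropy."""
    n_feats = len(frozen_w)
    head_w = [[0.0] * n_feats for _ in range(n_classes)]
    head_b = [0.0] * n_classes
    # Frozen layers never change, so their outputs can be computed once.
    feats = [frozen_features(x, frozen_w) for x in samples]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            probs = head_forward(f, head_w, head_b)
            for k in range(n_classes):
                # Gradient of cross-entropy with respect to logit k.
                grad = probs[k] - (1.0 if k == y else 0.0)
                head_b[k] -= lr * grad
                for j in range(n_feats):
                    head_w[k][j] -= lr * grad * f[j]
    return head_w, head_b


def predict(x, frozen_w, head_w, head_b):
    """Return the index of the most probable class."""
    probs = head_forward(frozen_features(x, frozen_w), head_w, head_b)
    return probs.index(max(probs))


if __name__ == "__main__":
    frozen_w = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]
    samples = [[1.0, 0.1], [0.9, 0.0], [0.8, 0.2],
               [0.1, 1.0], [0.0, 0.9], [0.2, 0.8]]
    labels = [0, 0, 0, 1, 1, 1]
    head_w, head_b = fine_tune_head(samples, labels, frozen_w)
    print([predict(x, frozen_w, head_w, head_b) for x in samples])
```

Because the frozen layers are fixed, their outputs are computed once and reused in every epoch, which is one reason fine-tuning only the last layers trains faster than training a full network. In practice this step would be done with a deep learning framework and a real pre-trained model.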

The advantages of transfer learning include shorter training time, less parameter and architecture search for deep neural networks, and high classification performance even on relatively small datasets. This analysis was done on a Microsoft dataset of 2.2 million hashes of malware binaries and 10 columns of data information.

Figure 2: We retrained the last fully connected layer and softmax.

The last step of STAMINA is evaluation. As evaluation metrics, we considered accuracy, false positive rate, precision, recall, F1 score, and area under the receiver operating characteristic (ROC) curve. Based on feedback from malware analysis practitioners, we also reported recall at false positive rates from 0.1% to 10%, read from the ROC curve.
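
The metrics listed above can be computed from labels and classifier scores as in the following sketch. It is a plain-Python illustration using standard definitions, with label 1 for malware and 0 for benign. The function names and the toy scores are hypothetical and are not results from this work.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, false positive rate, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "fpr": ratio(fp, fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve as the fraction of correctly ranked
    positive/negative pairs (ties count as half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at_fpr(y_true, scores, max_fpr):
    """Highest recall reachable by a score threshold whose FPR is at most max_fpr."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold by one distinct score, consuming all ties.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # FPR only grows as the threshold drops further
        best = max(best, tp / n_pos)
    return best


if __name__ == "__main__":
    y_true = [1, 1, 1, 1] + [0] * 10
    scores = [0.95, 0.9, 0.7, 0.4,
              0.8, 0.6, 0.3, 0.2, 0.15, 0.1, 0.05, 0.04, 0.03, 0.02]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores), recall_at_fpr(y_true, scores, 0.1))
```

Recall at a fixed false positive rate answers the question practitioners usually care about: how much malware is caught when only a given fraction of benign files may be flagged.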


As malware variants continue to grow, traditional signature matching techniques cannot keep up. We looked to apply deep learning techniques to avoid costly feature engineering, and used machine learning techniques to learn and build classification systems that can effectively identify malware program binaries. We explored a novel image-based technique on x86 program binaries, which resulted in 99.07% accuracy with a 2.58% false positive rate.

Our study indicates the pros and cons of sample-based and metadata-based methods. The major advantage of the sample-based approach is that we can go in-depth into the samples and extract textural information, so all the characteristics of the malware files are captured during training. However, for larger applications, STAMINA becomes less effective because the software cannot convert billions of pixels into JPEG images and then resize them. In cases like this, metadata-based methods show advantages over sample-based models.

For future work, we would like to evaluate hybrid models that combine intermediate representations of the binaries and information extracted from binaries with deep learning approaches. These datasets are expected to be bigger but may provide higher accuracy. We will also continue to explore platform acceleration optimizations for our deep learning models so we can deploy such detection techniques with minimal power and performance impact on the end user.

Notices and Disclaimers

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.

Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit

Performance results are based on internal testing and may not reflect all publicly available updates. No product or component can be absolutely secure.

Your costs and results may vary.

Intel technologies may require enabled hardware, software or service activation.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.