Evaluating Feature Robustness for Windows Malware Family Classification
Abstract
Machine learning approaches that classify malware by family save analysts valuable time during incident response. A key challenge for these approaches is selecting features that are robust against concept drift, that is, the change in malware over time. In this paper, we evaluate a dynamic feature set based on Windows handles (e.g., files, registry keys) for malware family classification. Specifically, we examine the features' vulnerabilities and evaluate their robustness against concept drift. We curated a novel dataset that simulates the manipulations that attackers may apply to malware samples. By training machine learning classifiers on malware collected in the wild and testing them on samples that underwent these manipulations, we demonstrate improved robustness to concept drift over traditional API-call-based features. Further, we investigate time decay due to concept drift using temporally consistent evaluations that do not assume access to newer information. The evaluation shows that our features are robust against malware obfuscation. Finally, we empirically demonstrate how malware labeling conventions (malware type or family) can affect results, and we make recommendations for dataset construction.