Predicting malware attributes from cybersecurity texts

Arpita Roy; Youngja Park; Shimei Pan

NAACL 2019

Conference paper

02 Jun 2019

Abstract

Text analytics is a useful tool for studying malware behavior and tracking emerging threats. Automatically identifying malware attributes from cybersecurity texts is very challenging because the number of malware attribute labels is large and the number of training instances is small. In this paper, we propose a novel feature learning method that leverages diverse knowledge sources, such as a small amount of human annotations, unlabeled text, and specifications of malware attribute labels. Our evaluation demonstrates the effectiveness of our method over state-of-the-art malware attribute prediction systems.
