Reconciling malware labeling discrepancy via consensus learning

Ting Wang; Xin Hu; Shicong Meng; Reiner Sailer

doi:10.1109/ICDEW.2014.6818308

ICDEW 2014

Conference paper

31 Mar 2014

Abstract

Anti-virus systems developed by different vendors often show strong discrepancies in the labels they assign to the same malware, which significantly hinders threat intelligence sharing. The key challenge in addressing this discrepancy is the difficulty of re-standardizing systems that are already in use. In this paper we explore a non-intrusive alternative. We propose to leverage the correlation between the malware labels of different anti-virus systems to create a 'consensus' classification system, through which different systems can share information without modifying their own labeling conventions. To this end, we present a novel classification integration framework, Latin, which exploits the correspondence between participating anti-virus systems as reflected in heterogeneous information at the instance-instance, instance-class, and class-class levels. We report results from extensive experimental studies using real datasets and concrete use cases to verify the efficacy of Latin in reconciling the malware labeling discrepancy. © 2014 IEEE.
