A family of joint sparse PCA algorithms for anomaly localization in network data streams
Abstract
Determining anomalies in data streams that are collected and transformed from various types of networks has recently attracted significant research interest. Principal component analysis (PCA) has been extensively applied to detecting anomalies in network data streams. However, none of existing PCA-based approaches addresses the problem of identifying the sources that contribute most to the observed anomaly, or anomaly localization. In this paper, we propose novel sparse PCA methods to perform anomaly detection and localization for network data streams. Our key observation is that we can localize anomalies by identifying a sparse low-dimensional space that captures the abnormal events in data streams. To better capture the sources of anomalies, we incorporate the structure information of the network stream data in our anomaly localization framework. Furthermore, we extend our joint sparse PCA framework with multidimensional Karhunen Loève Expansion that considers both spatial and temporal domains of data streams to stabilize localization performance. We have performed comprehensive experimental studies of the proposed methods and have compared our methods with the state-of-the-art using three real-world data sets from different application domains. Our experimental studies demonstrate the utility of the proposed methods. © 2013 IEEE.