Publication
SDM 2021
Conference paper
Signature-based anomaly detection in networks
Abstract
The problem of outlier detection has been studied extensively in spatial and multi-dimensional databases. In the multi-dimensional case, the problem is much simpler because outliers can be interpreted naturally in terms of distances. For example, data points in multi-dimensional data satisfy the triangle inequality, which can be considered a relaxed version of transitivity for the closeness of data points. It is therefore much easier to find data points that lie far away from the majority of other points. This is not the case, however, in general networks, where closeness shows no such transitivity. In fact, a node can be defined as an outlier when it is either close to an excessively large number of nodes or far away from a large number of nodes. Traditional measures of distance or density sparsity therefore cannot accurately model the concept of outliers in massive networks. We define two kinds of signatures in massive networks: distance set signatures and distance frequency signatures. We use these signatures to model the outlier detection problem effectively in massive networks. We present experimental results illustrating the effectiveness of our approach over a structural distance-based approach.
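The abstract names distance frequency signatures but does not define them. As a rough illustration only, the sketch below assumes one plausible reading: for a given node, count how many other nodes lie at each hop distance. The function names `hop_distances` and `distance_frequency_signature`, the adjacency-dict representation, and the toy graph are all hypothetical and are not taken from the paper.

```python
from collections import Counter, deque


def hop_distances(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_frequency_signature(adj, source):
    """Assumed definition: map each hop distance to the number of nodes at it."""
    dist = hop_distances(adj, source)
    counts = Counter(d for node, d in dist.items() if node != source)
    return dict(sorted(counts.items()))


# Toy undirected graph: node 0 is a hub; nodes 4-5-6 form a tail.
adj = {
    0: [1, 2, 3, 4],
    1: [0],
    2: [0],
    3: [0],
    4: [0, 5],
    5: [4, 6],
    6: [5],
}

hub_signature = distance_frequency_signature(adj, 0)   # {1: 4, 2: 1, 3: 1}
tail_signature = distance_frequency_signature(adj, 6)  # {1: 1, 2: 1, 3: 1, 4: 3}
```

Under this assumed definition, the hub has most of its mass at distance 1 while the tail node has most of its mass at distance 4, which mirrors the abstract's point that a node may stand out by being close to many nodes or far from many nodes. How the paper actually compares signatures to score outliers is not stated in the abstract.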