Statistical outlier detection using direct density ratio estimation

Shohei Hido; Yuta Tsuboi; Hisashi Kashima; Masashi Sugiyama; Takafumi Kanamori

doi:10.1007/s10115-010-0283-2

KAIS

Paper

10 Feb 2010

Abstract

We propose a new statistical approach to the problem of inlier-based outlier detection, i.e., finding outliers in the test set based on a training set consisting only of inliers. Our key idea is to use the ratio of training and test data densities as an outlier score. This approach is expected to perform better even in high-dimensional problems, since methods for directly estimating the density ratio without going through density estimation are available. Among various density ratio estimation methods, we employ the method called unconstrained least-squares importance fitting (uLSIF), since it is equipped with natural cross-validation procedures, allowing us to objectively optimize the value of tuning parameters such as the regularization parameter and the kernel width. Furthermore, uLSIF offers a closed-form solution as well as a closed-form formula for the leave-one-out error, so it is computationally very efficient and scalable to massive datasets. Simulations with benchmark and real-world datasets illustrate the usefulness of the proposed approach. © 2010 Springer-Verlag London Limited.
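The idea of scoring test points by a directly estimated density ratio can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian kernel model for the ratio of training to test density, kernel centers drawn at random from the training set, and fixed values for the kernel width `sigma` and the regularization parameter `lam`, whereas the abstract describes choosing these by cross-validation. The function names `gaussian_kernel` and `ulsif_outlier_scores` are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, centers, sigma):
    """Gaussian kernel matrix between the rows of x and the rows of centers."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def ulsif_outlier_scores(x_train, x_test, sigma=1.0, lam=0.1,
                         n_centers=100, seed=0):
    """Estimate w(x) = p_train(x) / p_test(x) at every test point.

    Small values suggest outliers: the point is plausible under the test
    density but has little support among the inlier-only training data.
    sigma and lam are fixed here (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_centers = min(n_centers, len(x_train))
    # Assumption: kernel centers are a random subset of the training inliers.
    idx = rng.choice(len(x_train), size=n_centers, replace=False)
    centers = x_train[idx]

    phi_test = gaussian_kernel(x_test, centers, sigma)    # denominator samples
    phi_train = gaussian_kernel(x_train, centers, sigma)  # numerator samples

    # Regularized least-squares fit of the ratio model, solved in closed form.
    H = phi_test.T @ phi_test / len(x_test)
    h = phi_train.mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(n_centers), h)
    alpha = np.maximum(alpha, 0.0)  # clip negative coefficients to zero

    return phi_test @ alpha


# Usage: inliers from a standard normal, plus one distant test point.
rng = np.random.default_rng(1)
x_train = rng.normal(size=(200, 2))
x_test = np.vstack([rng.normal(size=(100, 2)), [[10.0, 10.0]]])
scores = ulsif_outlier_scores(x_train, x_test)
print("score of the distant point:", scores[-1])
print("median score of the other test points:", np.median(scores[:-1]))
```

Because the fit reduces to a single linear solve, repeating it over a grid of candidate `sigma` and `lam` values is inexpensive, which is what makes a cross-validated choice of these parameters practical.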
