Hypothesis selection and testing by the MDL principle

J. Rissanen

Paper

01 Dec 1999

Abstract

The central idea of the MDL (Minimum Description Length) principle is to represent a class of models (hypotheses) by a universal model capable of imitating the behavior of any model in the class. The principle calls for a model class whose representative assigns the largest probability or density to the observed data. Two examples of universal models for parametric classes $M$ are the normalized maximum likelihood (NML) model $\hat{f}(x^n \mid M) = f(x^n \mid \hat{\theta}(x^n)) \big/ \int_{\Omega} f(y^n \mid \hat{\theta}(y^n))\, dy^n$, where $\Omega$ is an appropriately selected set, and a mixture $f_w(x^n \mid M) = \int f(x^n \mid \theta)\, w(\theta)\, d\theta$ as a convex linear functional of the models. In this interpretation a Bayes factor is the ratio of mixture representatives of two model classes. However, mixtures need not be the best representatives, and as will be shown the NML model provides a strictly better test for the mean being zero in the Gaussian cases where the variance is known or taken as a parameter.
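
As a rough illustration of the two universal models named in the abstract, the following sketch evaluates both for the Bernoulli class on binary sequences of length n. The Bernoulli class is an assumption chosen here because its normalizer is a finite sum rather than the integral over the set used in the abstract; the uniform prior for the mixture and all function names are likewise illustrative and do not come from the paper.

```python
from math import comb, factorial


def max_likelihood(k: int, n: int) -> float:
    """Maximized Bernoulli likelihood of one sequence with k ones among n symbols."""
    if n == 0:
        return 1.0
    p = k / n  # maximum likelihood estimate of the Bernoulli parameter
    # Python evaluates 0.0 ** 0 as 1.0, which covers k == 0 and k == n.
    return p ** k * (1 - p) ** (n - k)


def nml_probability(k: int, n: int) -> float:
    """NML probability of one particular sequence with k ones (illustrative)."""
    # Normalizer: the maximized likelihood summed over all 2**n sequences,
    # grouped by their number of ones j (comb(n, j) sequences each).
    normalizer = sum(comb(n, j) * max_likelihood(j, n) for j in range(n + 1))
    return max_likelihood(k, n) / normalizer


def mixture_probability(k: int, n: int) -> float:
    """Mixture probability of the same sequence under an assumed uniform prior."""
    # Closed form of the integral of theta**k * (1 - theta)**(n - k) over [0, 1].
    return factorial(k) * factorial(n - k) / factorial(n + 1)


if __name__ == "__main__":
    n = 10
    for k in (0, 3, 5):
        print(k, nml_probability(k, n), mixture_probability(k, n))
```

Both functions return a probability for a single sequence, so weighting each value by `comb(n, k)` and summing over k gives 1 for either model. Comparing the two values for an observed sequence shows which representative assigns the data the larger probability, which is the quantity the principle asks to maximize.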
