About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
STOC 2002
Conference paper
The complexity of approximating entropy
Abstract
We consider the problem of approximating the entropy of a discrete distribution under several models. If the distribution is given explicitly as an array where the i-th location is the probability of the i-th element, then linear time is both necessary and sufficient for approximating the entropy. We consider a model in which the algorithm is given access only to independent samples from the distribution. Here, we show that a γ-multiplicative approximation to the entropy can be obtained in O (n(1+η)/γ(2) poly(log n)) time for distributions with entropy Ω(γ/η), where n is the size of the domain of the distribution and η is an arbitrarily small positive constant. We show that one cannot get a multiplicative approximation to the entropy in general in this model. Even for the class of distributions to which our upper bound applies, we obtain a lower bound of Ω (nmax(1/(2γ2),2/(5γ2-2))). We next consider a hybrid model in which both the explicit distribution as well as independent samples are available. Here, significantly more efficient algorithms can be achieved: a γ-multiplicative approximation to the entropy can be obtained in O (γ2 log2 n/h2(γ-1)2) time for distributions with entropy Ω(h); we show a lower bound of Ω (log n/h(γ2-1)). Finally, we consider two special families of distributions: those for which the probability of an element decreases monotonically in the label of the element, and those that are uniform over a subset of the domain. In each case, we give more efficient algorithms for approximating the entropy.