Publication
ICDM 2005
Conference paper
Effective estimation of posterior probabilities: Explaining the accuracy of randomized decision tree approaches
Abstract
A growing number of independently proposed methods introduce randomization at different stages of decision tree construction in order to build multiple trees. Randomized decision tree methods have been reported to be significantly more accurate than widely accepted single decision trees, even though the training procedure of some methods incorporates a surprising degree of randomness and therefore runs counter to the generally accepted idea of using gain functions to choose the optimal feature at each node and compute a single tree that fits the data. One important question that is not yet well understood is the reason behind this high accuracy. We provide an insight based on posterior probability estimation. We first establish the relationship between effective posterior probability estimation and effective loss reduction. We then argue that randomized decision tree methods effectively approximate the true probability distribution using the decision tree hypothesis space. We conduct experiments on both synthetic and real-world datasets under both 0-1 and cost-sensitive loss functions. © 2005 IEEE.
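As a rough illustration of the idea in the abstract, the following sketch builds several trees whose split features are chosen at random rather than by a gain function, then averages their leaf-level posterior estimates. It is not the paper's algorithm or experimental setup: the synthetic ground truth `true_posterior`, the Laplace-smoothed leaf estimate, the tree depth, and all function names are assumptions made for this example.

```python
import itertools
import random


def true_posterior(x):
    # Hypothetical ground truth P(y=1 | x), assumed for this sketch only.
    return 0.9 if x[0] == x[1] else 0.2


def make_data(n, n_features, rng):
    # Draw binary feature vectors and sample labels from true_posterior.
    rows = []
    for _ in range(n):
        x = tuple(rng.randint(0, 1) for _ in range(n_features))
        y = 1 if rng.random() < true_posterior(x) else 0
        rows.append((x, y))
    return rows


def build_random_tree(rows, features, depth, rng):
    # The split feature is picked uniformly at random: no gain function.
    if depth == 0 or not features or not rows:
        positives = sum(y for _, y in rows)
        return ("leaf", positives, len(rows))
    f = rng.choice(features)
    rest = [g for g in features if g != f]
    left = [(x, y) for x, y in rows if x[f] == 0]
    right = [(x, y) for x, y in rows if x[f] == 1]
    return ("node", f,
            build_random_tree(left, rest, depth - 1, rng),
            build_random_tree(right, rest, depth - 1, rng))


def tree_posterior(tree, x):
    # Walk to a leaf and return a Laplace-smoothed estimate of P(y=1 | x).
    while tree[0] == "node":
        _, f, left, right = tree
        tree = right if x[f] else left
    _, positives, n = tree
    return (positives + 1) / (n + 2)


def ensemble_posterior(trees, x):
    # Average the per-tree posterior estimates.
    return sum(tree_posterior(t, x) for t in trees) / len(trees)


rng = random.Random(0)
n_features = 4
train = make_data(2000, n_features, rng)
trees = [build_random_tree(train, list(range(n_features)), 3, rng)
         for _ in range(30)]

points = list(itertools.product((0, 1), repeat=n_features))

# Mean squared error of individual trees vs. the averaged estimate,
# measured against the assumed true posterior.
single_mse = sum((tree_posterior(t, x) - true_posterior(x)) ** 2
                 for x in points for t in trees) / (len(points) * len(trees))
ensemble_mse = sum((ensemble_posterior(trees, x) - true_posterior(x)) ** 2
                   for x in points) / len(points)

print("average single-tree MSE:", single_mse)
print("ensemble MSE:", ensemble_mse)
```

Because squared error is convex, the averaged estimate can never have a larger squared error than the average of the individual trees' squared errors at any point. The sketch only illustrates that general property on toy data; it says nothing about the magnitude of the gains reported in the paper.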