About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Abstract
Statistical relational learning (SRL) is a subarea in machine learning which addresses the problem of performing statistical inference on data that is correlated and not independently and identically distributed (i. i. d.)-as is generally assumed. For the traditional i. i. d. setting, distribution-free bounds exist, such as the Hoeffding bound, which are used to provide confidence bounds on the generalization error of a classification algorithm given its hold-out error on a sample size of N. Bounds of this form are currently not present for the type of interactions that are considered in the data by relational classification algorithms. In this paper, we extend the Hoeffding bounds to the relational setting. In particular, we derive distribution-free bounds for certain classes of data generation models that do not produce i. i. d. data and are based on the type of interactions that are considered by relational classification algorithms that have been developed in SRL. We conduct empirical studies on synthetic and real data which show that these data generation models are indeed realistic and the derived bounds are tight enough for practical use. © 2011 Springer-Verlag London Limited.