Publication
ICASSP 2015
Conference paper
Removing data with noisy responses in regression analysis
Abstract
In regression analysis, outliers in the data can induce a bias in the learned function, resulting in larger errors. In this paper we derive an empirically estimable bound on the regression error based on a Euclidean minimum spanning tree generated from the data. Using this bound as motivation, we propose an iterative approach to remove data with noisy responses from the training set. We evaluate the performance of the algorithm on experiments with real-world pathological speech (speech from individuals with neurogenic disorders). Comparative results show that removing noisy examples during training using the proposed approach yields better predictive performance on out-of-sample data.
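The abstract does not state the bound or the exact removal rule, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a simple proxy score (the mean squared difference between responses joined by an edge of the Euclidean minimum spanning tree) and a greedy rule that drops, at each iteration, the example whose removal lowers that score the most. The function names `euclidean_mst`, `mst_roughness`, and `remove_noisy` are hypothetical.

```python
import math


def euclidean_mst(points):
    """Return the edges (i, j) of a Euclidean MST using Prim's algorithm, O(n^2)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known distance from the tree to each point
    best_from = [-1] * n         # tree vertex that achieves that distance
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the closest point that is not yet in the tree.
        j = min((k for k in range(n) if not in_tree[k]), key=lambda k: best_dist[k])
        edges.append((best_from[j], j))
        in_tree[j] = True
        for k in range(n):
            if not in_tree[k]:
                d = math.dist(points[j], points[k])
                if d < best_dist[k]:
                    best_dist[k] = d
                    best_from[k] = j
    return edges


def mst_roughness(points, responses):
    """Assumed proxy score: mean squared response difference across MST edges."""
    edges = euclidean_mst(points)
    if not edges:
        return 0.0
    return sum((responses[i] - responses[j]) ** 2 for i, j in edges) / len(edges)


def remove_noisy(points, responses, n_remove):
    """Greedily drop the example whose removal lowers the proxy score the most.

    Returns the indices of the examples that are kept.
    """
    keep = list(range(len(points)))
    for _ in range(n_remove):
        if len(keep) <= 2:
            break

        def score_without(idx):
            rest = [k for k in keep if k != idx]
            return mst_roughness([points[k] for k in rest],
                                 [responses[k] for k in rest])

        worst = min(keep, key=score_without)
        keep.remove(worst)
    return keep


# Toy data: responses follow y = x except for one corrupted response at index 3.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
responses = [0.0, 1.0, 2.0, 100.0, 4.0, 5.0]
kept = remove_noisy(points, responses, n_remove=1)
print(kept)  # [0, 1, 2, 4, 5]
```

On this toy data the corrupted example is the one removed, because dropping it is the only choice that eliminates both large response jumps along the tree. The greedy loop rebuilds the tree for every candidate, which is quadratic or worse in the number of examples; it is meant to show the idea, not to scale.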