Efficient prototype selection supported by subspace partitions

Joel Luís Carbonera; Mara Abel

doi:10.1109/ICTAI.2017.00142

ICTAI 2017

Conference paper

04 Jun 2018

Efficient prototype selection supported by subspace partitions

View publication

Abstract

Nowadays, data mining approaches have been applied for extracting useful knowledge from a huge volume of data. In order to deal with this big data, techniques for prototype (or instance) selection have been applied for reducing the data to a manageable volume and, consequently, for reducing the computational resources that are necessary to apply data mining approaches. However, most of the proposed approaches for prototype selection have a high time complexity and, due to this, they cannot be applied for dealing with big data. In this paper, we propose an efficient approach for prototype selection called PSSP. It adopts the notion of subspace partition for efficiently splitting the dataset in sets of similar instances. In a second step, the algorithm extracts a prototype of each of the previously identified sets. The approach was evaluated on 11 well-known datasets used in a classification task, and its performance was compared to those of 6 state-of-The-Art algorithms, considering three measures: Accuracy, reduction, and effectiveness. All the obtained results show that, in general, the proposed approach provides a good trade-off between accuracy and reduction, with a significantly lower running time, when compared with other approaches.

Paper