Publication
Intelligent Data Analysis
Paper
Fast ordering of large categorical datasets for visualization
Abstract
An important issue in visualizing categorical data is how to order categorical values, that is, non-numeric values that have no natural ordering and are therefore difficult to map to visual coordinates. The focus of this paper is on constructing categorical orderings efficiently without compromising their visual quality. To avoid the inherent intractability of previous discrete formulations, we consider a continuous relaxation of the problem that can be solved exactly using the spectral method. The latter is based on computing certain algebraic information about the similarity matrix of the dataset. However, even computing the similarity matrix itself is prohibitive for large datasets. To achieve greater efficiency, we propose a new multi-level scheme based on an approximate representation of the matrix. We show that it is sufficient to compute only a small portion of the matrix, of size linear rather than quadratic in the number of objects, to guarantee a small probability of approximation error. Thus an effective ordering can be constructed without actually having to compute most pairwise similarities of values. Experiments have been conducted to qualitatively verify the effectiveness of the resulting visualizations. © 2002 IOS Press. All rights reserved.
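As a rough illustration of the spectral step mentioned in the abstract, the sketch below orders items by the Fiedler vector (the eigenvector for the second-smallest eigenvalue) of the graph Laplacian built from a similarity matrix. This is a common form of spectral ordering and is only an assumption about what "certain algebraic information" refers to. The function name `spectral_order` and the toy similarity matrix are hypothetical. The sketch computes the full quadratic-size matrix, so it does not reproduce the paper's multi-level approximation scheme.

```python
import numpy as np


def spectral_order(similarity):
    """Return item indices sorted by the Fiedler vector of the graph Laplacian.

    Illustrative only: assumes a symmetric, non-negative similarity matrix
    whose underlying graph is connected.
    """
    S = np.asarray(similarity, dtype=float)
    S = (S + S.T) / 2.0          # enforce symmetry (also makes a copy)
    np.fill_diagonal(S, 0.0)     # self-similarity does not affect the Laplacian
    laplacian = np.diag(S.sum(axis=1)) - S
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]      # eigenvector of the second-smallest eigenvalue
    return np.argsort(fiedler)


# Toy data (hypothetical): six categorical values with hidden positions on a
# line; similarity decays with the distance between hidden positions.
positions = [3, 0, 5, 1, 4, 2]
S = np.array([[np.exp(-abs(p - q)) for q in positions] for p in positions])

order = spectral_order(S)
# The hidden positions read off in the recovered order should be monotone,
# ascending or descending, because the sign of an eigenvector is arbitrary.
print([positions[i] for i in order])
```

Sorting by a single eigenvector gives every value a one-dimensional coordinate, which is what makes the relaxed problem tractable compared with searching over discrete permutations.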