About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
IEEE TCSVT
Paper
Construction of Diverse Image Datasets from Web Collections with Limited Labeling
Abstract
Image datasets play a pivotal role in advancing computer vision and multimedia research. However, most of the datasets are created by extensive human effort and are extremely expensive to scale-up. To address these issues, several automatic and semi-automatic approaches have been proposed for creating datasets by refining web images. However, these approaches either include significant redundant images in the dataset or fail to provide a diverse enough set to train a robust classifier. Ideally, a representative subset should be both semantically and visually diverse so as to provide the maximum amount of information under the current budget. Most current approaches are entirely based on the analysis of visual features, which may not correlate well with image semantics, and hence, collected images may not be sufficient to give a detailed understanding of a category. In this paper, we propose a system for creating diverse image dataset collections from the web with limited manual labeling effort. It is based upon a semi-supervised sparse coding framework that employs a joint visual-semantic space to simultaneously utilize both the images and associated textual information from the web for dataset construction. In addition, the proposed system is online and is capable of collecting more discriminative images continuously as new data becomes available, which is also suitable for enriching the existing datasets. The experiments demonstrate that our system can create and enrich datasets with limited manual labeling, with better cross-dataset generalization capability and diversity compared to the state-of-the-art datasets.