About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
IBM J. Res. Dev
Paper
Massive-scale learning of image and video semantic concepts
Abstract
Rapid growth in the capture and generation of images and videos is driving the need for more efficient and effective systems for analyzing, searching, and retrieving this data. Specific challenges include supporting automatic content indexing at a large scale and accurately extracting a sufficiently large number of relevant semantic concepts to enable effective search. In this paper, we describe the development of a system for massive-scale visual semantic concept extraction and learning for images and video. The system models the visual semantic space using a hierarchical faceted classification scheme across objects, scenes, people, activities, and events and utilizes a novel machine learning approach that creates ensemble classifiers from automatically extracted visual features. The ensemble learning and extraction processes are easily parallelizable for distributed processing using Hadoop® and IBM InfoSphere® Streams, which enable efficient processing of large data sets. We report on various applications and quantitative and qualitative results for different image and video data sets.