Spatial-DiscLDA for visual recognition

Zhenxing Niu; Gang Hua; Xinbo Gao; Qi Tian

doi:10.1109/CVPR.2011.5995426

CSCCVPR 2011

Conference paper

20 Jun 2011

Spatial-DiscLDA for visual recognition

View publication

Abstract

Topic models such as pLSA, LDA and their variants have been widely adopted for visual recognition. However, most of the adopted models, if not all, are unsupervised, which neglected the valuable supervised labels during model training. In this paper, we exploit recent advancement in supervised topic modeling, more particularly, the DiscLDA model for object recognition. We extend it to a part based visual representation to automatically identify and model different object parts. We call the proposed model as Spatial-DiscLDA (S-DiscLDA). It models the appearances and locations of the object parts simultaneously, which also takes the supervised labels into consideration. It can be directly used as a classifier to recognize the object. This is performed by an approximate inference algorithm based on Gibbs sampling and bridge sampling methods. We examine the performance of our model by comparing its performance with another supervised topic model on two scene category datasets, i.e., LabelMe and UIUC-sport dataset. We also compare our approach with other approaches which model spatial structures of visual features on the popular Caltech-4 dataset. The experimental results illustrate that it provides competitive performance. © 2011 IEEE.

Conference paper