Frame level annotations for tennis videos

Mohak Sukhwani; C.V. Jawahar

doi:10.1109/ICPR.2016.7899740

ICPR 2016

Conference paper

04 Dec 2016

Abstract

Content based indexing is critical to effective access to multimedia data. To this end, visual data is often annotated with textual content to bridge the semantic gap. In this paper, we present a method to generate frame level fine grained annotations for a given video clip. Access to frame level fine grained annotations leads to rich, dense and meaningful semantic associations between the text and the video, which in turn makes video retrieval systems more accurate. We demonstrate the use of probabilistic label consistent sparse coding and dictionary learning with a K-SVD algorithm to generate 'fine grained' annotations for one class of videos: lawn tennis. The algorithm simultaneously learns a classifier and a dictionary to generate frame level annotations for tennis videos using available textual descriptions. The utility of the proposed algorithm is demonstrated on a publicly available tennis dataset comprising tennis match videos from the Olympic Games.
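
For orientation, the joint learning of a dictionary and a classifier mentioned above can be illustrated with the standard (non-probabilistic) label consistent K-SVD objective from the literature. This is only a sketch under assumptions: the abstract does not give the formulation, and the probabilistic variant used in the paper may differ. All symbols are assumed here: Y holds the frame features, D is the dictionary, X the sparse codes, Q the label consistent target codes, A a linear transform, H the label matrix, W the linear classifier, alpha and beta the trade-off weights, and T the sparsity level.

```latex
% Standard LC-KSVD objective (assumed notation, not taken from the abstract)
\min_{D,\,A,\,W,\,X}\;
  \|Y - DX\|_F^2
  + \alpha \,\|Q - AX\|_F^2
  + \beta \,\|H - WX\|_F^2
\quad \text{s.t.}\quad \|x_i\|_0 \le T \;\; \forall i

% Equivalent stacked form, which an ordinary K-SVD solver can optimize
\min_{D,\,A,\,W,\,X}\;
  \left\|
    \begin{pmatrix} Y \\ \sqrt{\alpha}\,Q \\ \sqrt{\beta}\,H \end{pmatrix}
    -
    \begin{pmatrix} D \\ \sqrt{\alpha}\,A \\ \sqrt{\beta}\,W \end{pmatrix} X
  \right\|_F^2
\quad \text{s.t.}\quad \|x_i\|_0 \le T \;\; \forall i
```

In this formulation the first term is the reconstruction error, the second encourages frames with the same label to share similar sparse codes, and the third is the classification error, so a single K-SVD run on the stacked matrices yields the dictionary and the classifier together.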
