Audio-visual event detection using duration dependent input output Markov models

M. Naphade; A. Garg; T.S. Huang

doi:10.1109/IVL.2001.990854

CBAIVL 2001

Conference paper

14 Dec 2001

Abstract

Analyzing audio-visual data and detecting semantic events with spatio-temporal support is a challenging multimedia understanding problem. The difficulty lies in the gap between low-level media features and high-level semantic concepts. We introduce a duration-dependent input-output Markov model (DDIOMM) to detect events based on multiple modalities. The DDIOMM combines the ability to model non-exponential duration densities with the mapping of input sequences to output sequences. We test the DDIOMM by modeling the audio-visual event "explosion". We compare the detection performance of the DDIOMM with that of the IOMM and the HMM. Experiments reveal that modeling duration improves detection performance.
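
The remark about non-exponential duration densities can be illustrated with a small sketch. In a standard discrete-time HMM, a state with self-transition probability a is occupied for d steps with probability a^(d-1)(1-a), a geometric (discrete exponential) law whose most likely duration is always one step. A duration-dependent model can instead assign an arbitrary distribution to state durations. The code below is not the authors' DDIOMM; the function names and the peaked duration weights are hypothetical values chosen purely for illustration.

```python
def geometric_duration_pmf(self_transition, max_duration):
    """Duration law implied by a plain HMM state: P(d) = a**(d-1) * (1-a)."""
    a = self_transition
    return [a ** (d - 1) * (1.0 - a) for d in range(1, max_duration + 1)]


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def mode_duration(pmf):
    """Return the most probable duration (durations start at 1)."""
    return 1 + pmf.index(max(pmf))


# Duration law implied by an HMM state with self-transition 0.8 (assumed value).
hmm_pmf = geometric_duration_pmf(0.8, 10)

# Hypothetical explicit duration weights peaking at d = 4; not taken from the paper.
explicit_pmf = normalize([1, 3, 6, 8, 6, 3, 1, 0.5, 0.25, 0.1])

hmm_mode = mode_duration(hmm_pmf)
explicit_mode = mode_duration(explicit_pmf)
```

Whatever self-transition probability is chosen, the geometric law decreases monotonically from d = 1, so it cannot represent an event that typically lasts several frames; an explicit duration distribution removes that restriction.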
