Publication
EVENT 2001
Conference paper
Temporal events in all dimensions and scales
Abstract
This paper describes a new representation for the audio and visual information in a video signal. We reduce the dimensionality of the signals with singular-value decomposition (SVD) or mel-frequency cepstral coefficients (MFCC). We apply these transforms to word (word transcript, semantic space, or latent semantic indexing), image (color histogram), and audio (timbre) data. Using scale-space techniques, we find large jumps in a video's path, which are evidence for events. We use these techniques to analyze the temporal properties of the audio and image data in a video. This analysis creates a hierarchical segmentation of the video, or a table of contents, from both the audio and the image data.
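
As a rough illustration of the pipeline summarized in the abstract, the sketch below reduces per-frame feature vectors with SVD, smooths the resulting path with Gaussians of several widths, and reports where the smoothed path jumps the most at each scale. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names (reduce_with_svd, smooth_path, jump_strength, find_boundaries), the Gaussian smoothing kernel, the frame-to-frame Euclidean distance as the jump measure, and the choice of a single strongest jump per scale are all illustrative choices that do not come from the source.

```python
import numpy as np


def reduce_with_svd(features, k):
    """Project row-wise feature vectors (one row per frame) onto the
    top-k right singular vectors of the mean-centered data."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian truncated at about three standard deviations."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-(t ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def smooth_path(path, sigma):
    """Smooth each dimension of the path over time at scale sigma.
    Edge padding keeps the output the same length as the input."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(path, ((radius, radius), (0, 0)), mode="edge")
    columns = [
        np.convolve(padded[:, d], kernel, mode="valid")
        for d in range(path.shape[1])
    ]
    return np.stack(columns, axis=1)


def jump_strength(path, sigma):
    """Distance between consecutive points of the smoothed path.
    Entry i measures the jump between frame i and frame i + 1."""
    smoothed = smooth_path(path, sigma)
    return np.linalg.norm(np.diff(smoothed, axis=0), axis=1)


def find_boundaries(path, sigmas):
    """For each scale, return the index of the strongest jump
    (an assumed, simplified stand-in for event detection)."""
    return {
        sigma: int(np.argmax(jump_strength(path, sigma)))
        for sigma in sigmas
    }
```

A caller might stack one color histogram per frame into an array of shape (frames, bins), call reduce_with_svd(histograms, 3), and pass the result to find_boundaries with a few values of sigma. Jumps that persist at large sigma would suggest coarse boundaries, and jumps that appear only at small sigma would suggest finer ones, which is one way to read the hierarchical segmentation the abstract describes.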