International Journal of Image and Graphics



This paper presents a new study on the application of the Computational Media Aesthetics framework to the problem of automated understanding of film. Leveraging Film Grammar as the means of closing the "semantic gap" in media analysis, we examine film rhythm, a powerful narrative concept used to give a film structure and form compositionally and to enhance its lyrical quality experientially. The novelty of this paper lies in the specification and investigation of the rhythmic elements present in two cinematic devices, namely motion and editing patterns, and of their potential usefulness to automated content annotation and management systems. In our rhythm model, the motion behavior of a given shot is classified as nonexistent, fluid or staccato. Shot neighborhoods in movies are then grouped by their proportional makeup of these motion classes to yield seven high-level rhythmic arrangements that, in our experiments, prove adept at indicating likely scene content (e.g. dialogue or chase sequence). The second part of our investigation presents a computational model that detects editing patterns as metric, accelerated, decelerated or free. Details of the algorithm for extracting these classes are presented, along with experimental results on real movie data. An investigation of combined rhythmic patterns shows that detailed content identification from rhythm types alone is not possible, because film is not codified to this level in terms of rhythmic elements; nevertheless, analysis of the combined motion/editing rhythms allows us to determine that the content has changed and to hypothesize why. We present three such categories of change and demonstrate their efficacy in capturing useful film elements (e.g. a scene change precipitated by a plot event), with supporting data from five motion pictures.