End-to-end videotext recognition for multimedia content analysis

Chitra Dorai; Hrishikesh Aradhye; Jae-Chang Shim

doi:10.1109/ICME.2001.1237761

ICME 2001

Conference paper

22 Aug 2001

Abstract

Videotext refers to text superimposed on still images and video frames, and a videotext-based Multimedia Description Scheme has recently been adopted into the MPEG-7 standard as one of the normative media content description interfaces. While much of the previous work, including ours, concentrates on the task of locating and extracting text from video frames automatically, very little research has focused on reliably recognizing segmented text. The low resolution of videotext, unconstrained font styles and sizes, and poor separation of characters, often resulting from video compression and decoding, all pose severe problems even to commercial OCR systems in recognizing videotext accurately. This paper describes a novel end-to-end video character recognition system featuring new character attributes emphasizing macro shapes, a Support Vector Machine-based character classifier, videotext object synthesis, font context analysis, and temporal contiguity analysis, to successfully address the issues confounding accurate videotext recognition. We present results from our experiments with real video data that demonstrate the strengths of this system.
