Abstract
Face recognition has recently attracted increasing attention and is beginning to be applied in a variety of domains, predominantly for security, but also for video indexing. This paper describes the application of a face recognition system to video indexing, with the joint purpose of labelling faces in the video and identifying speakers. When the speaker's face is shown, the face recognition system can supplement acoustic speaker identification, allowing speakers to be indexed and the correct speaker-dependent model to be selected for speech transcription. The paper describes the feature detection and recognition methods used by the system and introduces a new method of aggregating multiple Gabor jet representations over a whole sequence. Several approaches to using such aggregate representations for recognizing faces in image sequences are compared. Results show a significant improvement in recognition rates when the whole sequence is used instead of a single image of the face.
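The abstract does not spell out how the jets are aggregated or which approaches are compared, so the following is only an illustrative sketch of what sequence-level recognition could look like, not the paper's method. It assumes each frame yields one Gabor jet reduced to a vector of filter-response magnitudes, uses the normalized dot product as the jet similarity, and contrasts two hypothetical strategies: averaging the jets over the sequence before matching, and classifying each frame separately followed by a majority vote. All function names and the toy gallery are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence

# Assumption: a jet is represented only by the magnitudes of the Gabor
# filter responses at one facial feature point.
Jet = Sequence[float]


def jet_similarity(a: Jet, b: Jet) -> float:
    """Normalized dot product (cosine similarity) of two magnitude jets."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def mean_jet(jets: List[Jet]) -> List[float]:
    """Hypothetical aggregation: element-wise mean of the per-frame jets."""
    n = len(jets)
    return [sum(column) / n for column in zip(*jets)]


def recognize_by_mean(frames: List[Jet], gallery: Dict[str, Jet]) -> str:
    """Match one aggregated jet against every enrolled identity."""
    probe = mean_jet(frames)
    return max(gallery, key=lambda name: jet_similarity(probe, gallery[name]))


def recognize_by_vote(frames: List[Jet], gallery: Dict[str, Jet]) -> str:
    """Classify each frame on its own, then return the majority label."""
    votes: Dict[str, int] = {}
    for jet in frames:
        best = max(gallery, key=lambda name: jet_similarity(jet, gallery[name]))
        votes[best] = votes.get(best, 0) + 1
    return max(votes, key=lambda name: votes[name])


if __name__ == "__main__":
    # Toy data: two enrolled identities and a three-frame probe sequence.
    gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
    frames = [[0.9, 0.1, 0.0], [0.8, 0.3, 0.1], [0.2, 0.9, 0.0]]
    print(recognize_by_mean(frames, gallery))
    print(recognize_by_vote(frames, gallery))
```

In this toy sequence the third frame, taken alone, matches the wrong identity, while both sequence-level strategies still return the identity supported by the majority of frames. This is the intuition behind using the whole sequence rather than a single image.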