Publication
MMSP 2007
Conference paper
An embedded system for in-vehicle visual speech activity detection
Abstract
We present a system for automatically detecting a driver's speech in the automobile domain using visual-only information extracted from the driver's mouth region. The work is motivated by the desire to eliminate manual push-to-talk activation of the speech recognition engine in newly designed voice interfaces in the typically noisy car environment, aiming to reduce driver cognitive load and increase the naturalness of the interaction. The proposed system uses a camera mounted on the rearview mirror to monitor the driver, detects face boundaries and facial features, and finally employs lip motion cues to recognize visual speech activity. In particular, the designed algorithm has very low computational cost, which allows real-time implementation on currently available inexpensive embedded platforms, as described in the paper. Experiments are also reported on a small multi-speaker database collected in moving automobiles that demonstrate promising accuracy. ©2007 IEEE.
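The paper itself details the low-cost algorithm; as a rough illustration of the general idea (lip motion in the mouth region signaling speech activity), the sketch below thresholds smoothed inter-frame motion energy over a mouth region of interest. All names, the threshold, and the smoothing window here are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def mouth_motion_energy(frames):
    # frames: array of shape (T, H, W) holding grayscale mouth-ROI crops.
    # Motion energy = mean absolute inter-frame pixel difference.
    frames = np.asarray(frames, dtype=np.float32)
    return np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

def detect_speech_activity(frames, threshold=8.0, win=5):
    # Returns a boolean decision per frame transition (length T-1).
    # threshold and win are illustrative, not values from the paper.
    energy = mouth_motion_energy(frames)
    # Moving-average smoothing suppresses single-frame noise spikes.
    kernel = np.ones(win) / win
    smoothed = np.convolve(energy, kernel, mode="same")
    return smoothed > threshold
```

Frame differencing of this kind is cheap (a subtraction and a mean per frame), which is consistent with the abstract's emphasis on real-time operation on inexpensive embedded hardware; the actual system additionally performs face and facial-feature detection to locate the mouth region first.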