Publication
MMSP 2007
Conference paper
An embedded system for in-vehicle visual speech activity detection
Abstract
We present a system for automatically detecting a driver's speech in the automobile domain using visual-only information extracted from the driver's mouth region. The work is motivated by the desire to eliminate manual push-to-talk activation of the speech recognition engine in newly designed voice interfaces in the typically noisy car environment, aiming to reduce driver cognitive load and increase the naturalness of the interaction. The proposed system uses a camera mounted on the rearview mirror to monitor the driver, detects face boundaries and facial features, and finally employs lip motion cues to recognize visual speech activity. In particular, the designed algorithm has very low computational cost, which allows real-time implementation on currently available inexpensive embedded platforms, as described in the paper. Experiments are also reported on a small multi-speaker database collected in moving automobiles that demonstrate promising accuracy. ©2007 IEEE.
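The paper itself details the low-cost algorithm; as a rough illustration of the general idea (lip motion in the mouth region signaling speech activity), the sketch below thresholds smoothed inter-frame motion energy over a mouth region of interest. All names, the threshold, and the smoothing window here are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def mouth_motion_energy(frames):
    # frames: array of shape (T, H, W) holding grayscale mouth-ROI crops.
    # Motion energy = mean absolute inter-frame pixel difference.
    frames = np.asarray(frames, dtype=np.float32)
    return np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

def detect_speech_activity(frames, threshold=8.0, win=5):
    # Returns a boolean decision per frame transition (length T-1).
    # threshold and win are illustrative, not values from the paper.
    energy = mouth_motion_energy(frames)
    # Moving-average smoothing suppresses single-frame noise spikes.
    kernel = np.ones(win) / win
    smoothed = np.convolve(energy, kernel, mode="same")
    return smoothed > threshold
```

Frame differencing of this kind is cheap (a subtraction and a mean per frame), which is consistent with the abstract's emphasis on real-time operation on inexpensive embedded hardware; the actual system additionally performs face and facial-feature detection to locate the mouth region first.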