Listening to natural and synthesized speech while driving: Effects on user performance
Abstract
The effects of message type (navigation, E-mail, news story), voice type (text-to-speech, natural human speech), and earcon cueing (present, absent) on message comprehension and driving performance were examined. Twenty-four licensed drivers (12 under 30, 12 over 65, both groups equally divided by gender) participated in the experiment. They drove the UMTRI driving simulator on a road consisting of straight sections and constant-radius curves, yielding two levels of low driving workload. As a control condition, data were also collected while participants were parked. In all conditions, participants were presented with three types of messages. Each message was immediately followed by a series of questions to assess comprehension. Navigation messages were about 4 seconds long (about 9 words), E-mail messages were about 40 seconds long (about 100 words), and news messages were about 80 seconds long (about 225 words). For all message types, comprehension of text-to-speech messages, as determined by the accuracy of responses to questions and by subjective ratings, was significantly worse than comprehension of natural speech (79 versus 83 percent correct answers; 7.7/10 versus 8.6/10 subjective rating). Driving workload did not affect comprehension. Interestingly, neither voice type (synthesized or natural) nor message type (navigation, E-mail, news) had a significant effect on basic driving performance, as measured by the standard deviations of lateral lane position and steering wheel angle.
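As a rough illustration of the two driving performance measures named above, the following minimal sketch computes the standard deviation of lateral lane position and of steering wheel angle from sampled data. The function name, the units, the sample values, and the use of the sample (n - 1) standard deviation are all assumptions made for illustration; they are not taken from the study.

```python
import statistics


def driving_variability(lane_positions_m, steering_angles_deg):
    """Return (SD of lateral lane position, SD of steering wheel angle).

    Assumption: the sample (n - 1) standard deviation is used; the source
    does not say which estimator the study applied.
    """
    sd_lane = statistics.stdev(lane_positions_m)
    sd_steer = statistics.stdev(steering_angles_deg)
    return sd_lane, sd_steer


# Hypothetical samples for one drive segment (not data from the study).
lane = [0.10, -0.05, 0.20, 0.00, -0.15]   # lateral offset from lane center, meters
steer = [1.0, -2.0, 0.5, 0.0, -1.5]       # steering wheel angle, degrees

sd_lane, sd_steer = driving_variability(lane, steer)
print(f"SD lane position: {sd_lane:.3f} m")
print(f"SD steering angle: {sd_steer:.3f} deg")
```

Larger values of either measure indicate more variable lane keeping or steering, which is why they are commonly compared across conditions such as voice type or message type.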