Publication
ASSETS 2018
Conference paper
Leveraging pauses to improve video captions
Abstract
Currently, video sites that offer automatic speech recognition display the auto-generated captions as arbitrarily segmented lines of unpunctuated text. This method of displaying captions can be detrimental to meaning, especially for deaf users who rely almost exclusively on captions. However, the captions can be made more readable by automatically detecting pauses in speech and using the pause duration as a determinant both for inserting simple punctuation and for more meaningfully segmenting and timing the display of lines of captions. A small user study suggests that a majority of users, whether deaf, hard of hearing, or hearing, prefer such adaptations to caption display.
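The core idea described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the function name `segment_captions`, the `(text, start_sec, end_sec)` word-timing format, and the pause thresholds are all assumptions chosen for clarity.

```python
# Hypothetical sketch of pause-based punctuation and caption segmentation.
# Word timings of the form (text, start_sec, end_sec) are assumed to come
# from an ASR system; the threshold values are illustrative, not from the paper.

def segment_captions(words, comma_pause=0.3, period_pause=0.7):
    """Return caption lines with punctuation inserted at speech pauses.

    A pause of at least `period_pause` seconds ends a sentence and starts
    a new caption line; a pause of at least `comma_pause` seconds inserts
    a comma within the current line.
    """
    lines, current = [], []
    for i, (text, start, end) in enumerate(words):
        current.append(text)
        if i + 1 < len(words):
            pause = words[i + 1][1] - end   # silence before the next word
        else:
            pause = period_pause            # end of transcript: close the sentence
        if pause >= period_pause:
            current[-1] += "."              # long pause: sentence boundary
            lines.append(" ".join(current)) # also break the caption line here
            current = []
        elif pause >= comma_pause:
            current[-1] += ","              # medium pause: clause boundary
    if current:
        lines.append(" ".join(current))
    return lines
```

For example, word timings with a long gap after "everyone" would yield two punctuated caption lines instead of one unpunctuated, arbitrarily wrapped block of text.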