Jehanzeb Mirza, Leonid Karlinsky, et al.
NeurIPS 2023
This paper presents a technique for adding sentence boundaries to text obtained by Automatic Speech Recognition (ASR) of conversational speech audio. We show that, starting from imprecise boundary information added using only silence information from an ASR system, boundary detection can be improved using Head and Tail phrases. We develop the technique and show its effectiveness on two manually transcribed corpora and one automatically transcribed corpus. The main purpose of adding sentence boundaries to ASR transcripts is to improve linguistic analysis, namely information extraction, for text mining systems that handle huge volumes of textual data and analyze trends and features of the concepts in that data. We therefore also show how adding boundaries improves two basic natural language processing tasks: part-of-speech (PoS) label assignment and adjective-noun extraction. © Springer-Verlag 2007.