Semantic word embedding neural network language models for automatic speech recognition

Kartik Audhkhasi; Abhinav Sethy; Bhuvana Ramabhadran

doi:10.1109/ICASSP.2016.7472828

ICASSP 2016

Conference paper

18 May 2016

Abstract

Semantic word embeddings have become increasingly important in natural language processing tasks over the last few years. Their popularity stems from their ability to capture rich semantic information through a distributed representation and from the availability of fast, scalable algorithms for learning them from large text corpora. State-of-the-art neural network language models (NNLMs) used in automatic speech recognition (ASR) and natural language processing also learn word embeddings, but these are optimized to model local N-gram dependencies in the training text rather than to capture semantic information. We hypothesize that semantic word embeddings provide information that differs from, and can complement, the information in the word embeddings learned by NNLMs. We propose novel feedforward NNLM architectures that incorporate semantic word embeddings. We apply the resulting NNLMs to ASR on broadcast news and show improvements in both perplexity and word error rate.
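
The abstract does not spell out the proposed architectures, so the following is only an illustrative sketch of one plausible way a feedforward NNLM could incorporate semantic word embeddings: each context word's learned embedding is concatenated with a fixed, pre-trained semantic embedding before the hidden layer. Every name here (`init_params`, `next_word_probs`, `perplexity`, `E_semantic`, the layer sizes, the tanh hidden layer) is an assumption made for illustration and is not taken from the paper. Perplexity is computed in the standard way, as the exponential of the negative mean log-probability of the target words.

```python
import numpy as np


def init_params(vocab_size, context_size, learned_dim, semantic_dim, hidden_dim, seed=0):
    """Randomly initialize a toy feedforward NNLM (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    # Each context word contributes a learned and a semantic embedding.
    input_dim = context_size * (learned_dim + semantic_dim)
    return {
        "E_learned": rng.normal(0.0, 0.1, (vocab_size, learned_dim)),
        "W_hidden": rng.normal(0.0, 0.1, (input_dim, hidden_dim)),
        "b_hidden": np.zeros(hidden_dim),
        "W_out": rng.normal(0.0, 0.1, (hidden_dim, vocab_size)),
        "b_out": np.zeros(vocab_size),
    }


def next_word_probs(params, E_semantic, context_ids):
    """Return P(next word | context) for one context of word ids.

    E_semantic is a fixed (vocab_size, semantic_dim) matrix, assumed to
    come from a separately trained semantic embedding model.
    """
    learned = params["E_learned"][context_ids]   # (context, learned_dim)
    semantic = E_semantic[context_ids]           # (context, semantic_dim)
    # Assumed fusion strategy: concatenate both embeddings per word.
    x = np.concatenate([learned, semantic], axis=1).reshape(-1)
    h = np.tanh(x @ params["W_hidden"] + params["b_hidden"])
    logits = h @ params["W_out"] + params["b_out"]
    logits = logits - logits.max()               # numerically stable softmax
    p = np.exp(logits)
    return p / p.sum()


def perplexity(params, E_semantic, word_ids, context_size):
    """Perplexity of a word-id sequence under the toy model."""
    log_probs = []
    for t in range(context_size, len(word_ids)):
        context = word_ids[t - context_size:t]
        probs = next_word_probs(params, E_semantic, context)
        log_probs.append(np.log(probs[word_ids[t]]))
    return float(np.exp(-np.mean(log_probs)))
```

In this sketch the semantic embeddings are held fixed while the N-gram embeddings would be trained with the rest of the network; other fusion strategies are possible, and the paper may use a different one.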
