Speaker recognition using common passphrases in RedDots
In this paper we report our work on RedDots, a recently collected text-dependent speaker recognition dataset, with a focus on the common passphrase condition. We first investigate an out-of-the-box approach. We then report several strategies for training on RedDots itself, using up to 40 speakers for training. The GMM-NAP framework is used as a baseline. We report the following novelties. First, we demonstrate the use of bagging for improved accuracy. Second, we estimate the equal error rate (EER) of a passphrase using metadata only. Third, we use the estimated EERs for improved score normalization. Finally, we report an analysis of system sensitivity to the time elapsed between enrollment and testing (template aging).
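Since the EER is the quantity estimated per passphrase and reused for score normalization, a minimal sketch of how an empirical EER can be computed from trial scores may help fix the definition. It assumes higher scores indicate the target speaker and that a trial is accepted when its score is at or above the threshold; the function name `equal_error_rate` and the toy scores are illustrative and are not taken from the paper or from RedDots.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Empirical EER: the error rate at the threshold where the
    false-acceptance and false-rejection rates are closest.

    Assumption: higher score = more likely the target speaker, and a
    trial is accepted when score >= threshold.
    """
    # Candidate thresholds are the observed scores themselves.
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection: target trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of the two rates at the closest crossing.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores, purely illustrative.
targets = [2.0, 3.0, 4.0, 5.0]
impostors = [0.0, 1.0, 2.5, 1.5]
eer = equal_error_rate(targets, impostors)
```

With finitely many trials the two error rates rarely coincide exactly, so this sketch reports the midpoint at the closest crossing; interpolating between adjacent operating points is a common alternative.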