Fishervoice and semi-supervised speaker clustering

Stephen M. Chu; Hao Tang; Thomas S. Huang

doi:10.1109/ICASSP.2009.4960527

ICASSP 2009

Conference paper

23 Sep 2009

Abstract

Speaker subspace modeling has become increasingly important in speaker recognition, diarization, and clustering. Principal component analysis (PCA) is a popular linear subspace learning technique, and the PCA-based approach that represents an arbitrary utterance or speaker as a linear combination of a set of basis voices is known as the eigenvoice approach. In this paper, we propose a novel technique, the fishervoice approach. It is based on linear discriminant analysis (LDA), another successful linear subspace learning technique, which provides an optimized low-dimensional representation of utterances or speakers that focuses on the most discriminative basis voices. We apply the fishervoice approach to speaker clustering in a semi-supervised manner and show that it significantly outperforms the eigenvoice approach in all our experiments on the GALE Mandarin dataset. ©2009 IEEE.
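
The contrast between a PCA basis and a linear discriminant analysis (LDA) basis can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the synthetic 5-dimensional "utterance vectors", the two-speaker setup, and the helper names `pca_basis`, `lda_basis`, and `fisher_ratio` are all assumptions made for illustration, and the sketch uses fully labeled data rather than the semi-supervised setting described in the abstract. It shows only the general idea that a PCA direction follows the largest variance, while an LDA direction follows the largest ratio of between-speaker to within-speaker scatter.

```python
import numpy as np

def pca_basis(X, k):
    """Top-k principal directions of X (rows are utterance vectors)."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the PCA directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T  # shape (d, k)

def lda_basis(X, y, k, reg=1e-6):
    """Top-k Fisher discriminant directions for labels y (hypothetical helper)."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = reg * np.eye(d)  # within-speaker scatter, lightly regularized
    Sb = np.zeros((d, d))  # between-speaker scatter
    for label in np.unique(y):
        Xs = X[y == label]
        mu = Xs.mean(axis=0)
        Sw += (Xs - mu).T @ (Xs - mu)
        diff = (mu - mean)[:, None]
        Sb += len(Xs) * (diff @ diff.T)
    # Whiten with the Cholesky factor of Sw so the generalized
    # eigenproblem Sb w = lambda Sw w becomes a symmetric one.
    L = np.linalg.cholesky(Sw)
    A = np.linalg.solve(L, Sb)
    M = np.linalg.solve(L, A.T)  # equals L^{-1} Sb L^{-T}
    _, V = np.linalg.eigh(M)  # eigenvalues in ascending order
    top = V[:, ::-1][:, :k]
    return np.linalg.solve(L.T, top)  # map back to the original space

def fisher_ratio(z, y):
    """Between-speaker over within-speaker scatter of a 1-D projection z."""
    mu = z.mean()
    between = 0.0
    within = 0.0
    for label in np.unique(y):
        zs = z[y == label]
        between += len(zs) * (zs.mean() - mu) ** 2
        within += ((zs - zs.mean()) ** 2).sum()
    return between / within

# Synthetic data (an assumption, not the GALE Mandarin dataset):
# dimension 0 carries large speaker-independent variation,
# dimension 1 carries the small but discriminative speaker offset.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n)
X = rng.normal(scale=0.3, size=(2 * n, 5))
X[:, 0] = rng.normal(scale=10.0, size=2 * n)
X[:, 1] += np.where(y == 0, -1.0, 1.0)

ratio_pca = fisher_ratio(X @ pca_basis(X, 1)[:, 0], y)
ratio_lda = fisher_ratio(X @ lda_basis(X, y, 1)[:, 0], y)
print("Fisher ratio, PCA direction:", ratio_pca)
print("Fisher ratio, LDA direction:", ratio_lda)
```

On data like this, the leading PCA direction tends to align with the high-variance nuisance dimension, whereas the LDA direction aligns with the dimension that separates the speakers, so the LDA projection yields the larger Fisher ratio.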