The Iroquois Model: Using Temporal Dynamics to Separate Speakers

Steven Rennie; Peder Olsen; John Hershey; Trausti Kristjansson

SAPA 2006

Conference paper

16 Sep 2006

Abstract

We describe a system that can separate and recognize the simultaneous speech of two speakers from a single-channel recording, and we compare its performance to that of human subjects. The system, which we call Iroquois, uses models of dynamics to achieve performance near that of human listeners. However, the system exhibits a pattern of performance across conditions that differs from that of human subjects. In conditions where the amplitudes of the two speakers are similar, the Iroquois model surpasses human performance by over 50%. We hypothesize that the system accomplishes this remarkable feat by employing a different strategy from that of the human auditory system.
