Nobuyasu Itoh, Gakuto Kurata, et al.
INTERSPEECH 2015
This paper proposes a novel weighting algorithm for Cross-power Spectrum Phase (CSP) analysis that improves the accuracy of direction-of-arrival (DOA) estimation for beamforming in noisy environments. The sound source is a human speaker, and the noise is broadband noise in an automobile. The harmonic structure of the human speech spectrum can be used to weight the CSP analysis, because harmonic bins must contain more speech power than the other bins and therefore provide more reliable information. However, most conventional methods that leverage harmonic structure require pitch estimation with voiced-unvoiced classification, and this estimation is not sufficiently accurate in noisy environments. In the new approach, the observed power spectrum is converted directly into weights for the CSP analysis by retaining only the local peaks that are considered to be harmonic structures. The experiment showed that the proposed approach significantly reduced localization errors, with further improvement when it was combined with other weighting algorithms. © 2010 Osamu Ichikawa et al.
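The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a two-microphone setup, a single analysis frame, binary weights (1 at retained local peaks, 0 elsewhere), and an extra mean-power threshold to reject noise-only peaks; none of these details come from the abstract. The function names (`local_peak_weights`, `weighted_csp_delay`, `lag_to_doa_degrees`), the sample rate, and the microphone spacing are all hypothetical.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(N^2) DFT, adequate for one short illustrative frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def local_peak_weights(power):
    """Binary weights: 1.0 at local maxima above the mean power, else 0.0.

    Assumption: the mean-power threshold is added here to reject
    noise-only peaks; it is not taken from the paper.
    """
    n = len(power)
    mean = sum(power) / n
    return [1.0 if (power[k] > power[k - 1] and power[k] > power[(k + 1) % n]
                    and power[k] > mean) else 0.0
            for k in range(n)]

def weighted_csp_delay(x1, x2, max_lag):
    """Estimate by how many samples x2 lags x1 using a weighted CSP."""
    n = len(x1)
    s1, s2 = dft(x1), dft(x2)
    # Average the two channels' power spectra before picking peaks.
    power = [(abs(a) ** 2 + abs(b) ** 2) / 2 for a, b in zip(s1, s2)]
    weights = local_peak_weights(power)
    cross = []
    for a, b, w in zip(s1, s2, weights):
        c = b * a.conjugate()          # cross-power spectrum for this bin
        mag = abs(c)
        # Keep only the phase (the CSP normalization), then apply the weight.
        cross.append(w * c / mag if mag > 1e-12 else 0j)

    def csp(lag):
        # Inverse DFT evaluated at a single candidate lag.
        return sum(c * cmath.exp(2j * math.pi * k * lag / n)
                   for k, c in enumerate(cross)).real

    return max(range(-max_lag, max_lag + 1), key=csp)

def lag_to_doa_degrees(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Far-field conversion from inter-microphone lag to an arrival angle."""
    s = speed_of_sound * lag / (sample_rate * mic_spacing)
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

# Synthetic demo: a harmonic "voiced" frame plus independent white noise
# on each channel; the second channel is a circularly delayed copy.
rng = random.Random(0)
n, true_lag = 256, 3
clean = [sum(math.sin(2 * math.pi * 8 * h * t / n) for h in range(1, 16))
         for t in range(n)]
x1 = [clean[t] + rng.gauss(0, 0.3) for t in range(n)]
x2 = [clean[(t - true_lag) % n] + rng.gauss(0, 0.3) for t in range(n)]

estimated = weighted_csp_delay(x1, x2, max_lag=8)
angle = lag_to_doa_degrees(estimated, sample_rate=16000, mic_spacing=0.1)
print(estimated, round(angle, 1))
```

The naive DFT keeps the sketch dependency-free; a real system would use an FFT and accumulate the weighted CSP over many frames. Because the synthetic frame is periodic, the search range `max_lag` is kept below half the pitch period to avoid an ambiguous peak.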
Tohru Nagano, Shinsuke Mori, et al.
INTERSPEECH - Eurospeech 2005
Ryuki Tachibana, Zhiwei Shuang, et al.
INTERSPEECH 2009
Gakuto Kurata, Shinsuke Mori, et al.
ICASSP 2006