Abstract
This paper describes a system for connecting sounds and words in linked multi-dimensional vector spaces. The acoustic space is represented using anchor models and partitioned using agglomerative clustering. The semantic space is modeled by a hierarchical multinomial clustering model. Nodes in one space are linked by probabilistic models to the other space. With these linked models, users retrieve sounds with natural language, and the system describes new sounds with words.
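The linked-space idea can be illustrated with a minimal sketch. It is not the paper's implementation: the toy data, the two-dimensional stand-ins for anchor-model vectors, and the function names (`agglomerate`, `word_model`, `retrieve`, `describe`) are all hypothetical, and a flat smoothed multinomial per acoustic cluster replaces the hierarchical multinomial clustering model named above.

```python
import math
from collections import Counter

# Hypothetical toy data: each sound has a 2-D stand-in for an
# anchor-space vector and a few caption words.
SOUNDS = {
    "dog_bark_1": ([0.9, 0.1], ["dog", "bark"]),
    "dog_bark_2": ([0.8, 0.2], ["dog", "bark", "loud"]),
    "rain_1": ([0.1, 0.9], ["rain", "water"]),
    "rain_2": ([0.2, 0.8], ["rain", "storm"]),
}

def centroid(names):
    """Mean acoustic vector of the named sounds."""
    vecs = [SOUNDS[n][0] for n in names]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def agglomerate(names, k):
    """Bottom-up clustering: merge the two closest centroids until k remain."""
    clusters = [[n] for n in names]
    while len(clusters) > k:
        pairs = [(math.dist(centroid(a), centroid(b)), i, j)
                 for i, a in enumerate(clusters)
                 for j, b in enumerate(clusters) if i < j]
        _, i, j = min(pairs)
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

def word_model(cluster, vocab, alpha=0.1):
    """Smoothed multinomial P(word | acoustic cluster) from its captions."""
    counts = Counter(w for n in cluster for w in SOUNDS[n][1])
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def retrieve(query_words, clusters, models):
    """Words -> sounds: pick the cluster that best explains the query."""
    known = [w for w in query_words if w in models[0]]
    scores = [sum(math.log(m[w]) for w in known) for m in models]
    return clusters[scores.index(max(scores))]

def describe(vector, clusters, models, top=2):
    """Sound -> words: most probable words of the nearest acoustic cluster."""
    nearest = min(range(len(clusters)),
                  key=lambda i: math.dist(vector, centroid(clusters[i])))
    model = models[nearest]
    return sorted(model, key=model.get, reverse=True)[:top]

clusters = agglomerate(list(SOUNDS), k=2)
vocab = sorted({w for _, words in SOUNDS.values() for w in words})
models = [word_model(c, vocab) for c in clusters]

print(retrieve(["dog"], clusters, models))       # text query -> sounds
print(describe([0.15, 0.85], clusters, models))  # new sound -> words
```

The sketch links in one direction only, from acoustic clusters to word distributions, and reuses that link for both retrieval and description; the system described above links nodes in each space to the other.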