Speech
As more of the world moves online, demand is growing rapidly for systems that can understand users and speak to them in natural language. At IBM Research, we're working on next-generation AI that learns to decipher and replicate the way humans speak.
Our work
Training a customer service bot to sound more human
- Research
- Kim Martineau
Converting several audio streams into one voice makes it easier for AI to learn
- Research
- Kim Martineau
The pandemic changed the way we understand speech
- Research
- Rachel Ostrand
Austin or Boston? Making artificial speech more expressive, natural, and controllable
- Research
- Slava Shechtman, Raul Fernandez, and David Haws
- 8 minute read
Speech-to-text AI could help doctors prescribe placebo to ease chronic pain
- Research
- Sara Berger
- 6 minute read
A cognitive in-car companion to help us enjoy the journey
- Research
- 4 minute read
Publications
Beyond neuropsychological tests: AI speech analysis in PKU
- Susan Waisbren
- Kely Norel
- et al.
- 2024
- J. Inherit. Metab. Dis.
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
- Yuchen Hu
- Chen Chen
- et al.
- 2024
- NeurIPS 2024
Robust ASR Error Correction with Conservative Data Filtering
- Takuma Udagawa
- Masayuki Suzuki
- et al.
- 2024
- EMNLP 2024
Exploring the Benefits of Tokenization of Discrete Acoustic Units
- Avihu Dekel
- Raul Fernandez
- 2024
- INTERSPEECH 2024
Exploring the limits of decoder-only models trained on public speech recognition corpora
- Ankit Gupta
- George Saon
- et al.
- 2024
- INTERSPEECH 2024
M2 ASR: Multilingual Multi-task Automatic Speech Recognition via Multi-objective Optimization
- A Saif
- Lisha Chen
- et al.
- 2024
- INTERSPEECH 2024