"speech": models, code, and papers

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

Jul 06, 2023
Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu

Generative Speech Recognition Error Correction with Large Language Models

Sep 27, 2023
Chao-Han Huck Yang, Yile Gu, Yi-Chieh Liu, Shalini Ghosh, Ivan Bulyko, Andreas Stolcke

Sparse Finetuning for Inference Acceleration of Large Language Models

Oct 10, 2023
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh

AutoAD II: The Sequel -- Who, When, and What in Movie Audio Description

Oct 10, 2023
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman

Disentanglement in a GAN for Unconditional Speech Synthesis

Jul 04, 2023
Matthew Baas, Herman Kamper

Topic Identification For Spontaneous Speech: Enriching Audio Features With Embedded Linguistic Information

Jul 21, 2023
Dejan Porjazovski, Tamás Grósz, Mikko Kurimo

CB-Whisper: Contextual Biasing Whisper using TTS-based Keyword Spotting

Sep 18, 2023
Yuang Li, Yinglu Li, Min Zhang, Chang Su, Mengyao Piao, Xiaosong Qiao, Jiawei Yu, Miaomiao Ma, Yanqing Zhao, Hao Yang

L1-aware Multilingual Mispronunciation Detection Framework

Sep 14, 2023
Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

Exploring Strategies for Modeling Sign Language Phonology

Sep 30, 2023
Lee Kezar, Riley Carlin, Tejas Srinivasan, Zed Sehyr, Naomi Caselli, Jesse Thomason

ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus

Jul 29, 2023
Tolulope Ogunremi, Kola Tubosun, Anuoluwapo Aremu, Iroro Orife, David Ifeoluwa Adelani
