"speech": models, code, and papers

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

May 23, 2023
Ye-Xin Lu, Yang Ai, Zhen-Hua Ling

Speaker-independent Speech Inversion for Estimation of Nasalance

May 31, 2023
Yashish M. Siriwardena, Carol Espy-Wilson, Suzanne Boyce, Mark K. Tiede, Liran Oren

Home monitoring for frailty detection through sound and speaker diarization analysis

Aug 17, 2023
Yannis Tevissen, Dan Istrate, Vincent Zalc, Jérôme Boudy, Gérard Chollet, Frédéric Petitpont, Sami Boutamine

UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures

May 31, 2023
Zhong-Qiu Wang, Shinji Watanabe

SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization

Jun 21, 2023
Changhun Kim, Joonhyung Park, Hajin Shim, Eunho Yang

An analysis on the effects of speaker embedding choice in non auto-regressive TTS

Jul 19, 2023
Adriana Stan, Johannah O'Mahony

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Mar 07, 2023
Mohamed Anwar, Bowen Shi, Vedanuj Goswami, Wei-Ning Hsu, Juan Pino, Changhan Wang

TorchAudio-Squim: Reference-less Speech Quality and Intelligibility measures in TorchAudio

Apr 04, 2023
Anurag Kumar, Ke Tan, Zhaoheng Ni, Pranay Manocha, Xiaohui Zhang, Ethan Henderson, Buye Xu

From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion

Aug 02, 2023
Robin San Roman, Yossi Adi, Antoine Deleforge, Romain Serizel, Gabriel Synnaeve, Alexandre Défossez

FonMTL: Towards Multitask Learning for the Fon Language

Aug 28, 2023
Bonaventure F. P. Dossou, Iffanice Houndayi, Pamely Zantou, Gilles Hacheme
