"speech": models, code, and papers

Autodecompose: A generative self-supervised model for semantic decomposition

Feb 06, 2023
Mohammad Reza Bonyadi

Via arXiv

Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition

Mar 28, 2022
Yuchen Hu, Nana Hou, Chen Chen, Eng Siong Chng

Via arXiv

Probing Statistical Representations For End-To-End ASR

Nov 03, 2022
Anna Ollerenshaw, Md Asif Jalal, Thomas Hain

Via arXiv

On-device neural speech synthesis

Sep 17, 2021
Sivanand Achanta, Albert Antony, Ladan Golipour, Jiangchuan Li, Tuomo Raitio, Ramya Rasipuram, Francesco Rossi, Jennifer Shi, Jaimin Upadhyay, David Winarsky, Hepeng Zhang

Via arXiv

Remix-cycle-consistent Learning on Adversarially Learned Separator for Accurate and Stable Unsupervised Speech Separation

Mar 26, 2022
Kohei Saijo, Tetsuji Ogawa

Via arXiv

Predicting Affective Vocal Bursts with Finetuned wav2vec 2.0

Sep 27, 2022
Bagus Tris Atmaja, Akira Sasou

Via arXiv

Defense Against Adversarial Attacks on Audio DeepFake Detection

Dec 30, 2022
Piotr Kawa, Marcin Plata, Piotr Syga

Via arXiv

Conversation-oriented ASR with multi-look-ahead CBS architecture

Nov 02, 2022
Huaibo Zhao, Shinya Fujie, Tetsuji Ogawa, Jin Sakuma, Yusuke Kida, Tetsunori Kobayashi

Via arXiv

Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis

Mar 02, 2022
Pengyu Cheng, Zhenhua Ling

Via arXiv

Whose Emotion Matters? Speaker Detection without Prior Knowledge

Dec 08, 2022
Hugo Carneiro, Cornelius Weber, Stefan Wermter

Via arXiv