
"speech": models, code, and papers

CycleGAN-Based Unpaired Speech Dereverberation

Mar 29, 2022
Hannah Muckenhirn, Aleksandr Safin, Hakan Erdogan, Felix de Chaumont Quitry, Marco Tagliasacchi, Scott Wisdom, John R. Hershey

Figures 1-3

Listen, denoise, action! Audio-driven motion synthesis with diffusion models

Nov 17, 2022
Simon Alexanderson, Rajmund Nagy, Jonas Beskow, Gustav Eje Henter

Figures 1-4

SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech

Jun 24, 2022
Hyunjae Cho, Wonbin Jung, Junhyeok Lee, Sang Hoon Woo

Figures 1-4

Improved Meta Learning for Low Resource Speech Recognition

May 11, 2022
Satwinder Singh, Ruili Wang, Feng Hou

Figures 1-4

Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis

Apr 03, 2022
Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng

Figures 1-4

Using Deep Learning Techniques and Inferential Speech Statistics for AI Synthesised Speech Recognition

Jul 23, 2021
Arun Kumar Singh, Priyanka Singh, Karan Nathwani

Figures 1-4

Selecting and combining complementary feature representations and classifiers for hate speech detection

Jan 18, 2022
Rafael M. O. Cruz, Woshington V. de Sousa, George D. C. Cavalcanti

Figures 1-4

InQSS: a speech intelligibility assessment model using a multi-task learning network

Nov 04, 2021
Yu-Wen Chen, Yu Tsao

Figures 1-4

Injecting Text in Self-Supervised Speech Pretraining

Aug 27, 2021
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro Moreno

Figures 1-4

Decoding High-level Imagined Speech using Attention-based Deep Neural Networks

Dec 13, 2021
Dae-Hyeok Lee, Sung-Jin Kim, Keon-Woo Lee

Figures 1-4