"speech": models, code, and papers

StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis

May 30, 2022
Yinghao Aaron Li, Cong Han, Nima Mesgarani

Via arXiv

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

Oct 31, 2022
Suyoun Kim, Ke Li, Lucas Kabela, Rongqing Huang, Jiedan Zhu, Ozlem Kalinli, Duc Le

Via arXiv

Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis

May 09, 2022
Zhenzi Weng, Zhijin Qin, Xiaoming Tao, Chengkang Pan, Guangyi Liu, Geoffrey Ye Li

Via arXiv

Recycling an anechoic pre-trained speech separation deep neural network for binaural dereverberation of a single source

Aug 09, 2022
Sania Gul, Muhammad Salman Khan, Syed Waqar Shah, Ata Ur-Rehman

Via arXiv

Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator

Feb 27, 2023
Vladimir Bataev, Roman Korostik, Evgeny Shabalin, Vitaly Lavrukhin, Boris Ginsburg

Via arXiv

Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems

Apr 11, 2022
Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury

Via arXiv

Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition

May 17, 2022
Zengrui Jin, Mengzhe Geng, Jiajun Deng, Tianzi Wang, Shujie Hu, Guinan Li, Xunying Liu

Via arXiv

Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings

Jan 16, 2023
Kai Liu, Xucheng Wan, Ziqing Du, Huan Zhou

Via arXiv

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion

Mar 16, 2023
Hyun Joon Park, Seok Woo Yang, Jin Sob Kim, Wooseok Shin, Sung Won Han

Via arXiv

DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model

Mar 16, 2023
Yanzhe Fu, Yueteng Kang, Songjun Cao, Long Ma

Via arXiv