"speech": models, code, and papers

Efficient Speech Representation Learning with Low-Bit Quantization

Dec 14, 2022
Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Abdelrahman Mohamed

Cross-lingual Alzheimer's Disease detection based on paralinguistic and pre-trained features

Mar 14, 2023
Xuchu Chen, Yu Pu, Jinpeng Li, Wei-Qiang Zhang

AudioSlots: A slot-centric generative model for audio separation

May 09, 2023
Pradyumna Reddy, Scott Wisdom, Klaus Greff, John R. Hershey, Thomas Kipf

MVNet: Memory Assistance and Vocal Reinforcement Network for Speech Enhancement

Sep 15, 2022
Jianrong Wang, Xiaomin Li, Xuewei Li, Mei Yu, Qiang Fang, Li Liu

FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis

Oct 27, 2022
Yifan Hu, Rui Liu, Guanglai Gao, Haizhou Li

Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection

Oct 26, 2022
Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari

Learning to Jointly Transcribe and Subtitle for End-to-End Spontaneous Speech Recognition

Oct 14, 2022
Jakob Poncelet, Hugo Van hamme

A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference

Oct 27, 2022
Li-Wei Chen, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

Automatic Severity Assessment of Dysarthric speech by using Self-supervised Model with Multi-task Learning

Oct 27, 2022
Eun Jung Yeo, Kwanghee Choi, Sunhee Kim, Minhwa Chung

CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement

Sep 23, 2022
Sherif Abdulatif, Ruizhe Cao, Bin Yang
