"speech": models, code, and papers

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

Nov 04, 2022
Xin Zhang, Iván Vallés-Pérez, Andreas Stolcke, Chengzhu Yu, Jasha Droppo, Olabanji Shonibare, Roberto Barra-Chicote, Venkatesh Ravichandran

Improving Autoregressive NLP Tasks via Modular Linearized Attention

Apr 17, 2023
Victor Agostinelli, Lizhong Chen

LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

Nov 17, 2022
Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian

Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation

Feb 21, 2023
Biao Zhang, Barry Haddow, Rico Sennrich

AdapterEM: Pre-trained Language Model Adaptation for Generalized Entity Matching using Adapter-tuning

May 30, 2023
John Bosco Mugeni, Steven Lynden, Toshiyuki Amagasa, Akiyoshi Matono

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Mar 29, 2023
Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li

Rethinking complex-valued deep neural networks for monaural speech enhancement

Jan 11, 2023
Haibin Wu, Ke Tan, Buye Xu, Anurag Kumar, Daniel Wong

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Apr 13, 2023
Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg

Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition

Feb 28, 2023
Zhijie Shen, Wu Guo, Bin Gu

Dynamic Acoustic Compensation and Adaptive Focal Training for Personalized Speech Enhancement

Nov 22, 2022
Xiaofeng Ge, Jiangyu Han, Haixin Guan, Yanhua Long
