"speech": models, code, and papers

ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction

Oct 08, 2023
Jiajun He, Zekun Yang, Tomoki Toda

ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading

Jul 03, 2023
Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee

StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation

Jun 01, 2023
Kun Song, Yi Ren, Yi Lei, Chunfeng Wang, Kun Wei, Lei Xie, Xiang Yin, Zejun Ma

Contrastive Speaker Embedding With Sequential Disentanglement

Sep 23, 2023
Youzhi Tu, Man-Wai Mak, Jen-Tzung Chien

PCNN: A Lightweight Parallel Conformer Neural Network for Efficient Monaural Speech Enhancement

Jul 28, 2023
Xinmeng Xu, Weiping Tu, Yuhong Yang

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Aug 14, 2023
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran

Text Injection for Capitalization and Turn-Taking Prediction in Speech Models

Aug 14, 2023
Shaan Bijwadia, Shuo-yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang, Tara N. Sainath

Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models

Sep 14, 2023
Ju-ho Kim, Jungwoo Heo, Hyun-seo Shin, Chan-yeong Lim, Ha-Jin Yu

Knowledge Distilled Ensemble Model for sEMG-based Silent Speech Interface

Aug 07, 2023
Wenqiang Lai, Qihan Yang, Ye Mao, Endong Sun, Jiangnan Ye

Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition

Jul 14, 2023
Wenxuan Wang, Guodong Ma, Yuke Li, Binbin Du
