"speech": models, code, and papers

SNIPER Training: Variable Sparsity Rate Training For Text-To-Speech

Nov 14, 2022
Perry Lam, Huayun Zhang, Nancy F. Chen, Berrak Sisman, Dorien Herremans

Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations

Aug 10, 2022
Jaejin Cho, Raghavendra Pappagari, Piotr Żelasko, Laureano Moro-Velazquez, Jesús Villalba, Najim Dehak

VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition

Sep 12, 2022
Naoyuki Kanda, Jian Wu, Xiaofei Wang, Zhuo Chen, Jinyu Li, Takuya Yoshioka

Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning

Mar 21, 2023
Sung-Feng Huang, Chia-ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-yi Lee

TTS-Guided Training for Accent Conversion Without Parallel Data

Dec 20, 2022
Yi Zhou, Zhizheng Wu, Mingyang Zhang, Xiaohai Tian, Haizhou Li

Semantic-preserved Communication System for Highly Efficient Speech Transmission

May 25, 2022
Tianxiao Han, Qianqian Yang, Zhiguo Shi, Shibo He, Zhaoyang Zhang

RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling

Aug 12, 2022
Jian Liao, Adnan Karim, Shivesh Jadon, Rubaiat Habib Kazi, Ryo Suzuki

Improving Massively Multilingual ASR With Auxiliary CTC Objectives

Feb 27, 2023
William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe

Computer-assisted Pronunciation Training -- Speech synthesis is almost all you need

Jul 02, 2022
Daniel Korzekwa, Jaime Lorenzo-Trueba, Thomas Drugman, Bozena Kostek

Reproducibility is Nothing without Correctness: The Importance of Testing Code in NLP

Mar 31, 2023
Sara Papi, Marco Gaido, Andrea Pilzer, Matteo Negri
