"speech": models, code, and papers

A Unified Deep Speaker Embedding Framework for Mixed-Bandwidth Speech Data

Dec 01, 2020
Weicheng Cai, Ming Li

Via arXiv

Federated Learning with Dynamic Transformer for Text to Speech

Jul 09, 2021
Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Jie Liu, Chendong Zhao, Jing Xiao

Via arXiv

MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech

May 12, 2020
Jakob D. Havtorn, Jan Latko, Joakim Edin, Lasse Borgholt, Lars Maaløe, Lorenzo Belgrano, Nicolai F. Jacobsen, Regitze Sdun, Željko Agić

Via arXiv

Self supervised learning for robust voice cloning

Apr 07, 2022
Konstantinos Klapsas, Nikolaos Ellinas, Karolos Nikitaras, Georgios Vamvoukakis, Panos Kakoulidis, Konstantinos Markopoulos, Spyros Raptis, June Sig Sung, Gunu Jho, Aimilios Chalamandaris, Pirros Tsiakoulis

Via arXiv

Perception-Aware Attack: Creating Adversarial Music via Reverse-Engineering Human Perception

Jul 26, 2022
Rui Duan, Zhe Qu, Shangqing Zhao, Leah Ding, Yao Liu, Zhuo Lu

Via arXiv

Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing

Mar 29, 2022
Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata

Via arXiv

Detecting Adversarial Examples in Batches -- a geometrical approach

Jun 17, 2022
Danush Kumar Venkatesh, Peter Steinbach

Via arXiv

Pseudo Label Is Better Than Human Label

Mar 28, 2022
Dongseong Hwang, Khe Chai Sim, Zhouyuan Huo, Trevor Strohman

Via arXiv

Distance Learner: Incorporating Manifold Prior to Model Training

Jul 14, 2022
Aditya Chetan, Nipun Kwatra

Via arXiv

VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention

Feb 12, 2021
Peng Liu, Yuewen Cao, Songxiang Liu, Na Hu, Guangzhi Li, Chao Weng, Dan Su

Via arXiv