"speech": models, code, and papers

Predicting pairwise preferences between TTS audio stimuli using parallel ratings data and anti-symmetric twin neural networks

Sep 22, 2022
Cassia Valentini-Botinhao, Manuel Sam Ribeiro, Oliver Watts, Korin Richmond, Gustav Eje Henter


Towards the evaluation of simultaneous speech translation from a communicative perspective

Mar 15, 2021
Claudio Fantinuoli, Bianca Prandi


UserLibri: A Dataset for ASR Personalization Using Only Text

Jul 02, 2022
Theresa Breiner, Swaroop Ramaswamy, Ehsan Variani, Shefali Garg, Rajiv Mathews, Khe Chai Sim, Kilol Gupta, Mingqing Chen, Lara McConnaughey


End-to-end label uncertainty modeling for speech emotion recognition using Bayesian neural networks

Oct 07, 2021
Navin Raj Prabhu, Guillaume Carbajal, Nale Lehmann-Willenbrock, Timo Gerkmann


FrAUG: A Frame Rate Based Data Augmentation Method for Depression Detection from Speech Signals

Feb 11, 2022
Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan


ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection

Sep 01, 2021
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Héctor Delgado


Noise Classification Aided Attention-Based Neural Network for Monaural Speech Enhancement

May 31, 2021
Lu Ma, Song Yang, Yaguang Gong, Zhongqin Wu


Generalized Representations Learning for Time Series Classification

Sep 15, 2022
Wang Lu, Jindong Wang, Xinwei Sun, Yiqiang Chen, Xing Xie


EMA2S: An End-to-End Multimodal Articulatory-to-Speech System

Feb 07, 2021
Yu-Wen Chen, Kuo-Hsuan Hung, Shang-Yi Chuang, Jonathan Sherman, Wen-Chin Huang, Xugang Lu, Yu Tsao


Audio-Visual Speech Inpainting with Deep Learning

Oct 09, 2020
Giovanni Morrone, Daniel Michelsanti, Zheng-Hua Tan, Jesper Jensen
