"speech": models, code, and papers

The impact of removing head movements on audio-visual speech enhancement

Feb 02, 2022
Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar

Via arXiv

Speech Technology for Everyone: Automatic Speech Recognition for Non-Native English with Transfer Learning

Oct 15, 2021
Toshiko Shibano, Xinyi Zhang, Mia Taige Li, Haejin Cho, Peter Sullivan, Muhammad Abdul-Mageed

Via arXiv

Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss

Mar 02, 2021
Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi, Ryo Masumura

Via arXiv

Code-Switching Text Augmentation for Multilingual Speech Processing

Jan 07, 2022
Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali

Via arXiv

Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement

May 19, 2021
Guillaume Carbajal, Julius Richter, Timo Gerkmann

Via arXiv

Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation

Oct 19, 2021
Fengyu Yang, Jian Luan, Yujun Wang

Via arXiv

Streaming Multi-talker Speech Recognition with Joint Speaker Identification

Apr 05, 2021
Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong

Via arXiv

Towards Cross-speaker Reading Style Transfer on Audiobook Dataset

Aug 19, 2022
Xiang Li, Changhe Song, Xianhao Wei, Zhiyong Wu, Jia Jia, Helen Meng

Via arXiv

Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training

Oct 21, 2020
Renjie Zheng, Mingbo Ma, Baigong Zheng, Kaibo Liu, Jiahong Yuan, Kenneth Church, Liang Huang

Via arXiv