
"speech recognition": models, code, and papers

Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition

Oct 31, 2018
Chris Donahue, Bo Li, Rohit Prabhavalkar


Automatic Speech Recognition and Topic Identification for Almost-Zero-Resource Languages

Jun 18, 2018
Matthew Wiesner, Chunxi Liu, Lucas Ondel, Craig Harman, Vimal Manohar, Jan Trmal, Zhongqiang Huang, Najim Dehak, Sanjeev Khudanpur


Universal speaker recognition encoders for different speech segments duration

Oct 28, 2022
Sergey Novoselov, Vladimir Volokhov, Galina Lavrentyeva


Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation

Aug 16, 2021
Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram


Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction

Oct 28, 2021
Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang


EmoNet: A Transfer Learning Framework for Multi-Corpus Speech Emotion Recognition

Mar 10, 2021
Maurice Gerczuk, Shahin Amiriparian, Sandra Ottl, Björn Schuller


Transformer-based Streaming ASR with Cumulative Attention

Mar 11, 2022
Mohan Li, Shucong Zhang, Catalin Zorila, Rama Doddipatla


Factorized Neural Transducer for Efficient Language Model Adaptation

Oct 07, 2021
Xie Chen, Zhong Meng, Sarangarajan Parthasarathy, Jinyu Li


Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos

Jun 09, 2022
Alexander Waibel, Moritz Behr, Fevziye Irem Eyiokur, Dogucan Yaman, Tuan-Nam Nguyen, Carlos Mullov, Mehmet Arif Demirtas, Alperen Kantarcı, Stefan Constantin, Hazım Kemal Ekenel


Is Speech Emotion Recognition Language-Independent? Analysis of English and Bangla Languages using Language-Independent Vocal Features

Nov 21, 2021
Fardin Saad, Hasan Mahmud, Md. Alamin Shaheen, Md. Kamrul Hasan, Paresha Farastu
