"speech recognition": models, code, and papers

Bidirectional Representations for Low Resource Spoken Language Understanding

Nov 24, 2022
Quentin Meeus, Marie-Francine Moens, Hugo Van hamme

Via arXiv

Biased Self-supervised learning for ASR

Nov 04, 2022
Florian L. Kreyssig, Yangyang Shi, Jinxi Guo, Leda Sari, Abdelrahman Mohamed, Philip C. Woodland

Via arXiv

End-to-end Audio-visual Speech Recognition with Conformers

Feb 12, 2021
Pingchuan Ma, Stavros Petridis, Maja Pantic

Via arXiv

SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training

Oct 07, 2022
Ziqiang Zhang, Long Zhou, Junyi Ao, Shujie Liu, Lirong Dai, Jinyu Li, Furu Wei

Via arXiv

Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation

Nov 02, 2020
Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier

Via arXiv

Robust Speech Recognition Using Generative Adversarial Networks

Nov 05, 2017
Anuroop Sriram, Heewoo Jun, Yashesh Gaur, Sanjeev Satheesh

Via arXiv

Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding

Oct 23, 2021
Wei Wang, Shuo Ren, Yao Qian, Shujie Liu, Yu Shi, Yanmin Qian, Michael Zeng

Via arXiv

LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition

Oct 21, 2020
Xie Chen, Sarangarajan Parthasarathy, William Gale, Shuangyu Chang, Michael Zeng

Via arXiv

More Speaking or More Speakers?

Nov 02, 2022
Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko

Via arXiv

Simulating realistic speech overlaps improves multi-talker ASR

Oct 27, 2022
Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka

Via arXiv