"speech recognition": models, code, and papers

Device Directedness with Contextual Cues for Spoken Dialog Systems

Nov 23, 2022
Dhanush Bekal, Sundararajan Srinivasan, Sravan Bodapati, Srikanth Ronanki, Katrin Kirchhoff

Speaker Anonymization with Phonetic Intermediate Representations

Jul 11, 2022
Sarina Meyer, Florian Lux, Pavel Denisov, Julia Koch, Pascal Tilli, Ngoc Thang Vu

Cross-sentence Neural Language Models for Conversational Speech Recognition

Jun 15, 2021
Shih-Hsuan Chiu, Tien-Hong Lo, Berlin Chen

Exploiting Adapters for Cross-lingual Low-resource Speech Recognition

May 18, 2021
Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki

Simulating realistic speech overlaps improves multi-talker ASR

Nov 17, 2022
Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka

A Simple Baseline for Domain Adaptation in End to End ASR Systems Using Synthetic Data

Jun 22, 2022
Raviraj Joshi, Anupam Singh

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Oct 27, 2022
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Fusing ASR Outputs in Joint Training for Speech Emotion Recognition

Oct 29, 2021
Yuanchao Li, Peter Bell, Catherine Lai

Robustness of end-to-end Automatic Speech Recognition Models -- A Case Study using Mozilla DeepSpeech

May 08, 2021
Aashish Agarwal, Torsten Zesch

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

Oct 01, 2021
Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu
