"speech recognition": models, code, and papers

Adapting an Unadaptable ASR System

Jun 01, 2023
Rao Ma, Mengjie Qian, Mark J. F. Gales, Kate M. Knill

Via arXiv

Personalized Predictive ASR for Latency Reduction in Voice Assistants

May 23, 2023
Andreas Schwarz, Di He, Maarten Van Segbroeck, Mohammed Hethnawi, Ariya Rastrow

Via arXiv

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems

Feb 15, 2023
Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Guinan Li, Shujie Hu, Xunying Liu

Via arXiv

Employing Hybrid Deep Neural Networks on Dari Speech

May 04, 2023
Jawid Ahmad Baktash, Mursal Dawodi

Via arXiv

Front-End Adapter: Adapting Front-End Input of Speech based Self-Supervised Learning for Speech Recognition

Feb 18, 2023
Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng

Via arXiv

Personalization of CTC Speech Recognition Models

Oct 18, 2022
Saket Dingliwal, Monica Sunkara, Srikanth Ronanki, Jeff Farris, Katrin Kirchhoff, Sravan Bodapati

Via arXiv

A Preliminary Study on Augmenting Speech Emotion Recognition using a Diffusion Model

May 19, 2023
Ibrahim Malik, Siddique Latif, Raja Jurdak, Björn Schuller

Via arXiv

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

Jul 06, 2023
Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu

Via arXiv

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Jul 14, 2023
Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur

Via arXiv

SGGNet$^2$: Speech-Scene Graph Grounding Network for Speech-guided Navigation

Jul 14, 2023
Dohyun Kim, Yeseung Kim, Jaehwi Jang, Minjae Song, Woojin Choi, Daehyung Park

Via arXiv