"speech recognition": models, code, and papers

A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge

May 02, 2023
Siddhant Arora, Hayato Futami, Shih-Lun Wu, Jessica Huynh, Yifan Peng, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe

Via arXiv

Critical Appraisal of Artificial Intelligence-Mediated Communication

May 15, 2023
Dara Tafazoli

Via arXiv

Conformers are All You Need for Visual Speech Recognition

Feb 17, 2023
Oscar Chang, Hank Liao, Dmitriy Serdyuk, Ankit Shah, Olivier Siohan

Via arXiv

Deep Speech Based End-to-End Automated Speech Recognition (ASR) for Indian-English Accents

Apr 03, 2022
Priyank Dubey, Bilal Shah

Via arXiv

SE-Bridge: Speech Enhancement with Consistent Brownian Bridge

May 23, 2023
Zhibin Qiu, Mengfan Fu, Fuchun Sun, Gulila Altenbek, Hao Huang

Via arXiv

Understanding Shared Speech-Text Representations

Apr 27, 2023
Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang

Via arXiv

Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication

Jun 01, 2023
Emin Cagatay Nakilcioglu, Maximilian Reimann, Ole John

Via arXiv

Performance Disparities Between Accents in Automatic Speech Recognition

Aug 01, 2022
Alex DiChristofano, Henry Shuster, Shefali Chandra, Neal Patwari

Via arXiv

End-to-end spoken language understanding using joint CTC loss and self-supervised, pretrained acoustic encoders

May 04, 2023
Jixuan Wang, Martin Radfar, Kai Wei, Clement Chung

Via arXiv

Heterogeneous Reservoir Computing Models for Persian Speech Recognition

May 25, 2022
Zohreh Ansari, Farzin Pourhoseini, Fatemeh Hadaeghi

Via arXiv