"speech recognition": models, code, and papers

Investigation of Adapter for Automatic Speech Recognition in Noisy Environment

Feb 29, 2024
Hao Shi, Tatsuya Kawahara

Kallaama: A Transcribed Speech Dataset about Agriculture in the Three Most Widely Spoken Languages in Senegal

Apr 02, 2024
Elodie Gauthier, Aminata Ndiaye, Abdoulaye Guissé

Multilingual Speech Models for Automatic Speech Recognition Exhibit Gender Performance Gaps

Feb 28, 2024
Giuseppe Attanasio, Beatrice Savoldi, Dennis Fucci, Dirk Hovy

Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey

Mar 02, 2024
Hamza Kheddar, Mustapha Hemis, Yassine Himeur

What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions

Apr 10, 2024
Hanyu Meng, Vidhyasaharan Sethu, Eliathamby Ambikairajah

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

Mar 07, 2024
Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee

Houston we have a Divergence: A Subgroup Performance Analysis of ASR Models

Mar 31, 2024
Alkis Koudounas, Flavio Giobergia

LV-CTC: Non-autoregressive ASR with CTC and latent variable models

Mar 28, 2024
Yuya Fujita, Shinji Watanabe, Xuankai Chang, Takashi Maekaku

ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models

Mar 29, 2024
Thibaut Thonet, Jos Rozen, Laurent Besacier

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview

Mar 01, 2024
Heyang Liu, Yu Wang, Yanfeng Wang
