"speech recognition": models, code, and papers

AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents

Feb 05, 2024
Abraham Toluwase Owodunni, Aditya Yadavalli, Chris Chinenye Emezue, Tobi Olatunji, Clinton C Mbataku

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

Jan 11, 2024
Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg

CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition

Jan 04, 2024
Junfeng Hou, Peiyao Wang, Jincheng Zhang, Meng Yang, Minwei Feng, Jingcheng Yin

Self-consistent context aware conformer transducer for speech recognition

Feb 09, 2024
Konstantin Kolokolov, Pavel Pekichev, Karthik Raghunathan

LiteVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data

Dec 15, 2023
Hendrik Laux, Emil Mededovic, Ahmed Hallawa, Lukas Martin, Arne Peine, Anke Schmeink

Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition

Feb 08, 2024
Lei Liu, Li Liu, Haizhou Li

The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese

Feb 12, 2024
Ajinkya Kulkarni, Anna Tokareva, Rameez Qureshi, Miguel Couceiro

Cross-Attention Fusion of Visual and Geometric Features for Large Vocabulary Arabic Lipreading

Feb 18, 2024
Samar Daou, Ahmed Rekik, Achraf Ben-Hamadou, Abdelaziz Kallel

How Paralingual are Paralinguistic Representations? A Case Study in Speech Emotion Recognition

Feb 02, 2024
Orchid Chetia Phukan, Gautam Siddharth Kashyap, Arun Balaji Buduru, Rajesh Sharma

Unified Speech-Text Pretraining for Spoken Dialog Modeling

Feb 08, 2024
Heeseung Kim, Soonshin Seo, Kyeongseok Jeong, Ohsung Kwon, Jungwhan Kim, Jaehong Lee, Eunwoo Song, Myungwoo Oh, Sungroh Yoon, Kang Min Yoo
