"speech recognition": models, code, and papers

What has LeBenchmark Learnt about French Syntax?

Mar 04, 2024
Zdravko Dugonjić, Adrien Pupier, Benjamin Lecouteux, Maximin Coavoux

Via arXiv

SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition

Jan 18, 2024
Hao Wang, Shuhei Kurita, Shuichiro Shimizu, Daisuke Kawahara

Via arXiv

Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR

Mar 11, 2024
Yufeng Yang, Ashutosh Pandey, DeLiang Wang

Via arXiv

SCORE: Self-supervised Correspondence Fine-tuning for Improved Content Representations

Mar 10, 2024
Amit Meghanani, Thomas Hain

Via arXiv

Two-pass Endpoint Detection for Speech Recognition

Jan 17, 2024
Anirudh Raju, Aparna Khare, Di He, Ilya Sklyar, Long Chen, Sam Alptekin, Viet Anh Trinh, Zhe Zhang, Colin Vaz, Venkatesh Ravichandran, Roland Maas, Ariya Rastrow

Via arXiv

AIx Speed: Playback Speed Optimization Using Listening Comprehension of Speech Recognition Models

Mar 05, 2024
Kazuki Kawamura, Jun Rekimoto

Via arXiv

ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge

Jan 07, 2024
He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, Binbin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li

Via arXiv

Exploring the limits of decoder-only models trained on public speech recognition corpora

Jan 31, 2024
Ankit Gupta, George Saon, Brian Kingsbury

Via arXiv

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition

Dec 18, 2023
Peng Shen, Xugang Lu, Hisashi Kawai

Via arXiv

Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens

Feb 03, 2024
Nay San, Georgios Paraskevopoulos, Aryaman Arora, Xiluo He, Prabhjot Kaur, Oliver Adams, Dan Jurafsky

Via arXiv