"speech recognition": models, code, and papers

Multilingual Transformer Language Model for Speech Recognition in Low-resource Languages

Sep 08, 2022
Li Miao, Jian Wu, Piyush Behre, Shuangyu Chang, Sarangarajan Parthasarathy

Robustness of Multi-Source MT to Transcription Errors

May 26, 2023
Dominik Macháček, Peter Polák, Ondřej Bojar, Raj Dabre

Joint Speech Recognition and Audio Captioning

Feb 03, 2022
Chaitanya Narisetty, Emiru Tsunoo, Xuankai Chang, Yosuke Kashiwagi, Michael Hentschel, Shinji Watanabe

Multimodal Audio-textual Architecture for Robust Spoken Language Understanding

Jun 13, 2023
Anderson R. Avila, Mehdi Rezagholizadeh, Chao Xing

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition

Oct 27, 2022
Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang

Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition

Nov 10, 2022
Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data

May 25, 2023
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami

Svarah: Evaluating English ASR Systems on Indian Accents

May 25, 2023
Tahir Javed, Sakshi Joshi, Vignesh Nagarajan, Sai Sundaresan, Janki Nawale, Abhigyan Raman, Kaushal Bhogale, Pratyush Kumar, Mitesh M. Khapra

The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition

Jun 20, 2022
Jonathan Mukiibi, Andrew Katumba, Joyce Nakatumba-Nabende, Ali Hussein, Josh Meyer

CopyNE: Better Contextual ASR by Copying Named Entities

May 22, 2023
Shilin Zhou, Zhenghua Li, Yu Hong, Min Zhang, Zhefeng Wang, Baoxing Huai
