"speech recognition": models, code, and papers

Phonetically-Oriented Word Error Alignment for Speech Recognition Error Analysis in Speech Translation

Apr 24, 2019
Nicholas Ruiz, Marcello Federico

WNARS: WFST based Non-autoregressive Streaming End-to-End Speech Recognition

Apr 08, 2021
Zhichao Wang, Wenwen Yang, Pan Zhou, Wei Chen

Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition

Jun 02, 2021
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoyuki Kamo

CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

Oct 14, 2021
Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio

Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation

Mar 07, 2023
Bac Nguyen, Stefan Uhlich, Fabien Cardinaux

Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation

Nov 11, 2022
Motoi Omachi, Brian Yan, Siddharth Dalmia, Yuya Fujita, Shinji Watanabe

A Review of Sparse Expert Models in Deep Learning

Sep 04, 2022
William Fedus, Jeff Dean, Barret Zoph

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale

Jul 19, 2022
Gopinath Chennupati, Milind Rao, Gurpreet Chadha, Aaron Eakin, Anirudh Raju, Gautam Tiwari, Anit Kumar Sahu, Ariya Rastrow, Jasha Droppo, Andy Oberlin, Buddha Nandanoor, Prahalad Venkataramanan, Zheng Wu, Pankaj Sitpure
