"speech recognition": models, code, and papers

FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition

Oct 18, 2021
Yichong Leng, Xu Tan, Rui Wang, Linchen Zhu, Jin Xu, Wenjie Liu, Linquan Liu, Tao Qin, Xiang-Yang Li, Edward Lin, Tie-Yan Liu

Measuring Equality in Machine Learning Security Defenses

Feb 17, 2023
Luke E. Richards, Edward Raff, Cynthia Matuszek

AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages

Mar 22, 2023
Chris Chinenye Emezue, Sanchit Gandhi, Lewis Tunstall, Abubakar Abid, Joshua Meyer, Quentin Lhoest, Pete Allen, Patrick Von Platen, Douwe Kiela, Yacine Jernite, Julien Chaumond, Merve Noyan, Omar Sanseviero

Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition

Nov 09, 2020
Wei Zhou, Simon Berger, Ralf Schlüter, Hermann Ney

Fast and parallel decoding for transducer

Oct 31, 2022
Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Żelasko, Daniel Povey

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments

Jun 13, 2019
Guan-Lin Chao, William Chan, Ian Lane

Self-training and Pre-training are Complementary for Speech Recognition

Oct 22, 2020
Qiantong Xu, Alexei Baevski, Tatiana Likhomanenko, Paden Tomasello, Alexis Conneau, Ronan Collobert, Gabriel Synnaeve, Michael Auli

CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

Nov 07, 2018
Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata

Provable Robustness for Streaming Models with a Sliding Window

Mar 28, 2023
Aounon Kumar, Vinu Sankar Sadasivan, Soheil Feizi

Pre-training in Deep Reinforcement Learning for Automatic Speech Recognition

Oct 26, 2019
Thejan Rajapakshe, Rajib Rana, Siddique Latif, Sara Khalifa, Björn W. Schuller
