Daniel Povey

Zipformer: A faster and better encoder for automatic speech recognition

Oct 17, 2023

Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition

Sep 26, 2023

PromptASR for contextualized ASR with controllable style

Sep 20, 2023

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context

Sep 15, 2023

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

Sep 14, 2023

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition

Aug 12, 2023

SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition

Jun 18, 2023

Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts

Jun 01, 2023

Blank-regularized CTC for Frame Skipping in Neural Transducer

May 19, 2023

Delay-penalized CTC implemented based on Finite State Transducer

May 19, 2023