Daniel Povey

Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation

Jul 14, 2024, via arXiv

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment

Jun 17, 2024, via arXiv

SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM

Jun 03, 2024, via arXiv

On Speaker Attribution with SURT

Jan 28, 2024, via arXiv

Zipformer: A faster and better encoder for automatic speech recognition

Oct 17, 2023, via arXiv

Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition

Sep 26, 2023, via arXiv

PromptASR for contextualized ASR with controllable style

Sep 20, 2023, via arXiv

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context

Sep 15, 2023, via arXiv

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

Sep 14, 2023, via arXiv

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition

Aug 12, 2023, via arXiv