Duc Le

Jack

Scaling Up Music Information Retrieval Training with Semi-Supervised Learning

Oct 02, 2023
Via arXiv

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding

Jul 22, 2023
Via arXiv

Text Generation with Speech Synthesis for ASR Data Augmentation

May 22, 2023
Via arXiv

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

Dec 15, 2022
Via arXiv

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

Nov 10, 2022
Via arXiv

Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers

Nov 02, 2022
Via arXiv

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

Oct 31, 2022
Via arXiv

Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning

Sep 30, 2022
Via arXiv

Learning ASR pathways: A sparse multilingual ASR model

Sep 13, 2022
Via arXiv

An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition

Apr 19, 2022
Via arXiv