Duc Le

Seq2seq for Automatic Paraphasia Detection in Aphasic Speech

Dec 16, 2023
Matthew Perez, Duc Le, Amrit Romana, Elise Jones, Keli Licata, Emily Mower Provost

Via arXiv

StemGen: A music generation model that listens

Dec 14, 2023
Julian D. Parker, Janne Spijkervet, Katerina Kosta, Furkan Yesiler, Boris Kuznetsov, Ju-Chiang Wang, Matt Avent, Jitong Chen, Duc Le

Via arXiv

A Foundation Model for Music Informatics

Nov 06, 2023
Minz Won, Yun-Ning Hung, Duc Le

Via arXiv

Scaling Up Music Information Retrieval Training with Semi-Supervised Learning

Oct 02, 2023
Yun-Ning Hung, Ju-Chiang Wang, Minz Won, Duc Le

Via arXiv

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding

Jul 22, 2023
Suyoun Kim, Akshat Shrivastava, Duc Le, Ju Lin, Ozlem Kalinli, Michael L. Seltzer

Via arXiv

Text Generation with Speech Synthesis for ASR Data Augmentation

May 22, 2023
Zhuangqun Huang, Gil Keren, Ziran Jiang, Shashank Jain, David Goss-Grubbs, Nelson Cheng, Farnaz Abtahi, Duc Le, David Zhang, Antony D'Avirro, Ethan Campbell-Taylor, Jessie Salas, Irina-Elena Veliche, Xi Chen

Via arXiv

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

Dec 15, 2022
Ke Li, Jay Mahadeokar, Jinxi Guo, Yangyang Shi, Gil Keren, Ozlem Kalinli, Michael L. Seltzer, Duc Le

Via arXiv

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

Nov 10, 2022
Andros Tjandra, Nayan Singhal, David Zhang, Ozlem Kalinli, Abdelrahman Mohamed, Duc Le, Michael L. Seltzer

Via arXiv

Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers

Nov 02, 2022
Duc Le, Frank Seide, Yuhao Wang, Yang Li, Kjell Schubert, Ozlem Kalinli, Michael L. Seltzer

Via arXiv

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

Oct 31, 2022
Suyoun Kim, Ke Li, Lucas Kabela, Rongqing Huang, Jiedan Zhu, Ozlem Kalinli, Duc Le

Via arXiv