Andros Tjandra

Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning

Jun 10, 2024

Audiobox: Unified Audio Generation with Natural Language Prompts

Dec 25, 2023

Generative Pre-training for Speech with Flow Matching

Oct 25, 2023

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Sep 22, 2023

Scaling Speech Technology to 1,000+ Languages

May 22, 2023

SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain

Jan 08, 2023

Voice-preserving Zero-shot Multiple Accent Conversion

Nov 23, 2022

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

Nov 10, 2022

Learning ASR pathways: A sparse multilingual ASR model

Sep 13, 2022

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

Nov 19, 2021