Andros Tjandra

Audiobox: Unified Audio Generation with Natural Language Prompts

Dec 25, 2023
Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

Generative Pre-training for Speech with Flow Matching

Oct 25, 2023
Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Sep 22, 2023
Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli

Scaling Speech Technology to 1,000+ Languages

May 22, 2023
Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli

SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain

Jan 08, 2023
Heli Qi, Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Voice-preserving Zero-shot Multiple Accent Conversion

Nov 23, 2022
Mumin Jin, Prashant Serai, Jilong Wu, Andros Tjandra, Vimal Manohar, Qing He

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

Nov 10, 2022
Andros Tjandra, Nayan Singhal, David Zhang, Ozlem Kalinli, Abdelrahman Mohamed, Duc Le, Michael L. Seltzer

Learning ASR pathways: A sparse multilingual ASR model

Sep 13, 2022
Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, John H. L. Hansen, Ozlem Kalinli

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

Nov 19, 2021
Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli
