Wei-Ning Hsu

XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

Mar 21, 2024
HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang

Audiobox: Unified Audio Generation with Natural Language Prompts

Dec 25, 2023
Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

Nov 05, 2023
Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

Generative Pre-training for Speech with Flow Matching

Oct 25, 2023
Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

Toward Joint Language Modeling for Speech Units and Text

Oct 12, 2023
Ju-Chieh Chou, Chung-Ming Chien, Wei-Ning Hsu, Karen Livescu, Arun Babu, Alexis Conneau, Alexei Baevski, Michael Auli

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS

Sep 29, 2023
Po-chun Hsu, Ali Elkahky, Wei-Ning Hsu, Yossi Adi, Tu Anh Nguyen, Jade Copet, Emmanuel Dupoux, Hung-yi Lee, Abdelrahman Mohamed

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

Aug 10, 2023
Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarandi, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

Jun 23, 2023
Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu

Scaling Speech Technology to 1,000+ Languages

May 22, 2023
Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

May 17, 2023
Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass
