Picture for Changhan Wang

Changhan Wang

XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

Add code
Mar 21, 2024
Figure 1 for XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
Figure 2 for XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
Figure 3 for XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
Figure 4 for XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
Viaarxiv icon

An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis

Add code
Mar 19, 2024
Figure 1 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Figure 2 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Figure 3 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Figure 4 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Viaarxiv icon

MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation

Add code
Mar 19, 2024
Figure 1 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Figure 2 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Figure 3 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Figure 4 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Viaarxiv icon

Seamless: Multilingual Expressive and Streaming Speech Translation

Add code
Dec 08, 2023
Figure 1 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 2 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 3 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 4 for Seamless: Multilingual Expressive and Streaming Speech Translation
Viaarxiv icon

Exploring Speech Enhancement for Low-resource Speech Synthesis

Add code
Sep 19, 2023
Figure 1 for Exploring Speech Enhancement for Low-resource Speech Synthesis
Figure 2 for Exploring Speech Enhancement for Low-resource Speech Synthesis
Figure 3 for Exploring Speech Enhancement for Low-resource Speech Synthesis
Figure 4 for Exploring Speech Enhancement for Low-resource Speech Synthesis
Viaarxiv icon

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

Add code
Aug 23, 2023
Figure 1 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 2 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 3 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 4 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Viaarxiv icon

Enhancing Speech-to-Speech Translation with Multiple TTS Targets

Add code
Apr 10, 2023
Figure 1 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Figure 2 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Figure 3 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Figure 4 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Viaarxiv icon

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Add code
Mar 07, 2023
Figure 1 for MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Figure 2 for MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Figure 3 for MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Figure 4 for MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Viaarxiv icon

Pre-training for Speech Translation: CTC Meets Optimal Transport

Add code
Jan 27, 2023
Figure 1 for Pre-training for Speech Translation: CTC Meets Optimal Transport
Figure 2 for Pre-training for Speech Translation: CTC Meets Optimal Transport
Figure 3 for Pre-training for Speech Translation: CTC Meets Optimal Transport
Figure 4 for Pre-training for Speech Translation: CTC Meets Optimal Transport
Viaarxiv icon

A Holistic Cascade System, benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation

Add code
Jan 25, 2023
Figure 1 for A Holistic Cascade System, benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation
Figure 2 for A Holistic Cascade System, benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation
Figure 3 for A Holistic Cascade System, benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation
Figure 4 for A Holistic Cascade System, benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation
Viaarxiv icon