Wei-Ning Hsu

Robust Self-Supervised Audio-Visual Speech Recognition

Jan 05, 2022
Via arXiv

Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction

Jan 05, 2022
Via arXiv

Textless Speech-to-Speech Translation on Real Data

Dec 15, 2021
Via arXiv

Textless Speech Emotion Conversion using Decomposed and Discrete Representations

Nov 14, 2021
Via arXiv

Direct simultaneous speech to speech translation

Oct 15, 2021
Via arXiv

fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit

Sep 14, 2021
Via arXiv

Text-Free Prosody-Aware Generative Spoken Language Modeling

Sep 07, 2021
Via arXiv

Direct speech-to-speech translation with discrete units

Jul 12, 2021
Via arXiv

Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition

Jun 14, 2021
Via arXiv

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Jun 14, 2021
Via arXiv