Picture for Wei-Ning Hsu

Wei-Ning Hsu

Generative Spoken Dialogue Language Modeling

Add code
Mar 30, 2022
Figure 1 for Generative Spoken Dialogue Language Modeling
Figure 2 for Generative Spoken Dialogue Language Modeling
Figure 3 for Generative Spoken Dialogue Language Modeling
Figure 4 for Generative Spoken Dialogue Language Modeling
Viaarxiv icon

Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training

Add code
Mar 02, 2022
Figure 1 for Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Figure 2 for Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Figure 3 for Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Figure 4 for Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Viaarxiv icon

textless-lib: a Library for Textless Spoken Language Processing

Add code
Feb 15, 2022
Figure 1 for textless-lib: a Library for Textless Spoken Language Processing
Figure 2 for textless-lib: a Library for Textless Spoken Language Processing
Figure 3 for textless-lib: a Library for Textless Spoken Language Processing
Figure 4 for textless-lib: a Library for Textless Spoken Language Processing
Viaarxiv icon

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

Add code
Feb 07, 2022
Figure 1 for data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Figure 2 for data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Figure 3 for data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Figure 4 for data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Viaarxiv icon

Robust Self-Supervised Audio-Visual Speech Recognition

Add code
Jan 05, 2022
Figure 1 for Robust Self-Supervised Audio-Visual Speech Recognition
Figure 2 for Robust Self-Supervised Audio-Visual Speech Recognition
Figure 3 for Robust Self-Supervised Audio-Visual Speech Recognition
Figure 4 for Robust Self-Supervised Audio-Visual Speech Recognition
Viaarxiv icon

Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction

Add code
Jan 05, 2022
Figure 1 for Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Figure 2 for Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Figure 3 for Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Figure 4 for Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Viaarxiv icon

Textless Speech-to-Speech Translation on Real Data

Add code
Dec 15, 2021
Figure 1 for Textless Speech-to-Speech Translation on Real Data
Figure 2 for Textless Speech-to-Speech Translation on Real Data
Figure 3 for Textless Speech-to-Speech Translation on Real Data
Figure 4 for Textless Speech-to-Speech Translation on Real Data
Viaarxiv icon

Textless Speech Emotion Conversion using Decomposed and Discrete Representations

Add code
Nov 14, 2021
Figure 1 for Textless Speech Emotion Conversion using Decomposed and Discrete Representations
Figure 2 for Textless Speech Emotion Conversion using Decomposed and Discrete Representations
Figure 3 for Textless Speech Emotion Conversion using Decomposed and Discrete Representations
Figure 4 for Textless Speech Emotion Conversion using Decomposed and Discrete Representations
Viaarxiv icon

Direct simultaneous speech to speech translation

Add code
Oct 15, 2021
Figure 1 for Direct simultaneous speech to speech translation
Figure 2 for Direct simultaneous speech to speech translation
Viaarxiv icon

fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit

Add code
Sep 14, 2021
Figure 1 for fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
Figure 2 for fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
Figure 3 for fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
Figure 4 for fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
Viaarxiv icon