Disong Wang

Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction
Jan 31, 2024

UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization
Jan 26, 2024

Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE
Oct 25, 2022

Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation
Feb 18, 2022

VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion
Feb 18, 2022

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
Jun 18, 2021

Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization
Jun 18, 2021

Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Sep 06, 2020