Shujie Hu

Autoregressive Speech Synthesis without Vector Quantization

Jul 11, 2024

Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation

Jul 08, 2024

Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition

Jun 14, 2024

Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask

Jun 14, 2024

One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model

Jun 14, 2024

WavLLM: Towards Robust and Adaptive Speech Large Language Model

Mar 31, 2024

Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation

Jan 01, 2024

Boosting Large Language Model for Speech Synthesis: An Empirical Study

Dec 30, 2023

Towards Automatic Data Augmentation for Disordered Speech Recognition

Dec 14, 2023

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

Jul 06, 2023