Dung Tran

LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes

Jun 05, 2024
Via arXiv

uaMix-MAE: Efficient Tuning of Pretrained Audio Transformers with Unsupervised Audio Mixtures

Mar 14, 2024
Via arXiv

Learned Image Compression with Text Quality Enhancement

Feb 13, 2024
Via arXiv

Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation

Sep 19, 2023
Via arXiv

Corrupting Data to Remove Deceptive Perturbation: Using Preprocessing Method to Improve System Robustness

Jan 05, 2022
Via arXiv

Augmented Contrastive Self-Supervised Learning for Audio Invariant Representations

Dec 21, 2021
Via arXiv

Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features

Dec 08, 2021
Via arXiv

An Investigation of End-to-End Multichannel Speech Recognition for Reverberant and Mismatch Conditions

Apr 28, 2019
Via arXiv

Speaker Selective Beamformer with Keyword Mask Estimation

Oct 25, 2018
Via arXiv