Yu Tsao

Graduate Program of Data Science, National Taiwan University and Academia Sinica, Taipei, Taiwan; Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan

Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN

Sep 21, 2022
Via arXiv

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

Jul 19, 2022
Via arXiv

NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling

Jun 18, 2022
Via arXiv

EPG2S: Speech Generation and Speech Enhancement based on Electropalatography and Audio Signals using Multimodal Learning

Jun 16, 2022
Via arXiv

XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding

Apr 29, 2022
Via arXiv

A Study of Using Cepstrogram for Countermeasure Against Replay Attacks

Apr 09, 2022
Via arXiv

Boosting Self-Supervised Embeddings for Speech Enhancement

Apr 07, 2022
Via arXiv

MTI-Net: A Multi-Target Speech Intelligibility Prediction Model

Apr 07, 2022
Via arXiv

MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids

Apr 07, 2022
Via arXiv

Perceptual Contrast Stretching on Target Feature for Speech Enhancement

Apr 01, 2022
Via arXiv