Chang Zeng

A Benchmark for Multi-speaker Anonymization

Jul 08, 2024

HAM-TTS: Hierarchical Acoustic Modeling for Token-Based Zero-Shot Text-to-Speech with Model and Data Scaling

Mar 09, 2024

CrossSinger: A Cross-Lingual Multi-Singer High-Fidelity Singing Voice Synthesizer Trained on Monolingual Singers

Sep 22, 2023

Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms

May 18, 2023

Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognit

Mar 23, 2023

Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification

Feb 22, 2023

Xiaoicesing 2: A High-Fidelity Singing Voice Synthesizer Based on Generative Adversarial Network

Oct 28, 2022

HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation

Oct 26, 2022

Deep Spectro-temporal Artifacts for Detecting Synthesized Speech

Oct 11, 2022

Exploring Deep Learning for Joint Audio-Visual Lip Biometrics

Apr 17, 2021