Zhengqi Wen

UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion

Jan 10, 2023

Emotion Selectable End-to-End Text-based Speech Editing

Dec 20, 2022

Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS

Oct 20, 2022

Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features

Aug 02, 2022

NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation

Mar 05, 2022

ADD 2022: the First Audio Deep Synthesis Detection Challenge

Feb 26, 2022

CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing

Feb 21, 2022

Singing-Tacotron: Global duration control attention and dynamic filter for End-to-end singing voice synthesis

Feb 16, 2022

FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization

Apr 07, 2021

TSNAT: Two-Step Non-Autoregressvie Transformer Models for Speech Recognition

Apr 04, 2021