Zhen-Hua Ling

A Study of the Removability of Speaker-Adversarial Perturbations

Oct 10, 2025
Via arXiv

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis

Sep 18, 2025
Via arXiv

Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding

Sep 04, 2025
Via arXiv

Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?

Aug 11, 2025
Via arXiv

Vision-Integrated High-Quality Neural Speech Coding

May 29, 2025
Via arXiv

Beyond Manual Transcripts: The Potential of Automated Speech Recognition Errors in Improving Alzheimer's Disease Detection

May 26, 2025
Via arXiv

Decoding Speaker-Normalized Pitch from EEG for Mandarin Perception

May 26, 2025
Via arXiv

Leveraging Cascaded Binary Classification and Multimodal Fusion for Dementia Detection through Spontaneous Speech

May 26, 2025
Via arXiv

Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising

May 22, 2025
Via arXiv

The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan

May 14, 2025
Via arXiv