Zhen-Hua Ling

Dynamic Prosody Prediction in LLM-based TTS for Improving Speaker Similarity

Jun 13, 2026
Via arXiv

DuraMark: Duration-Embedded Watermarking in LLM-based TTS

Jun 13, 2026
Via arXiv

CoSTA: Cognitive-State-Conditioned TTS Data Augmentation Using ASR Transcripts for Alzheimer's Disease Detection

Jun 04, 2026
Via arXiv

VoCodec: A Low-bitrate Streamable Neural Speech Codec with Voicing-driven Quantization

Jun 04, 2026
Via arXiv

Beyond WER: A Paired Acoustic Stress Test for Ambient Clinical Scribes

Jun 04, 2026
Via arXiv

An Ultra-Low-Bitrate Neural Speech Codec with Plain-to-Pseudo Synergistic Vector Quantization

Jun 04, 2026
Via arXiv

CFMDCTCodec: A Low-Bitrate Neural Speech Codec with Noise-Prior-aware Conditional Flow Matching for MDCT-Spectral Enhancement

May 26, 2026
Via arXiv

Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction

May 25, 2026
Via arXiv

CodeSep: Low-Bitrate Codec-Driven Speech Separation with Base-Token Disentanglement and Auxiliary-Token Serial Prediction

Jan 19, 2026
Via arXiv

Multiplicative Orthogonal Sequential Editing for Language Models

Jan 11, 2026
Via arXiv