Haizhou Li

Bridging What the Model Thinks and How It Speaks: Self-Aware Speech Language Models for Expressive Speech Generation

Apr 13, 2026
Via arXiv

AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis

Apr 05, 2026
Via arXiv

PhiNet: Speaker Verification with Phonetic Interpretability

Apr 02, 2026
Via arXiv

Controllable Accent Normalization via Discrete Diffusion

Mar 15, 2026
Via arXiv

AlphaFlowTSE: One-Step Generative Target Speaker Extraction via Conditional AlphaFlow

Mar 11, 2026
Via arXiv

TP-Spikformer: Token Pruned Spiking Transformer

Feb 28, 2026
Via arXiv

Discourse-Aware Dual-Track Streaming Response for Low-Latency Spoken Dialogue Systems

Feb 26, 2026
Via arXiv

Robust Spiking Neural Networks Against Adversarial Attacks

Feb 24, 2026
Via arXiv

CosyAccent: Duration-Controllable Accent Normalization Using Source-Synthesis Training Data

Feb 22, 2026
Via arXiv

AudioRAG: A Challenging Benchmark for Audio Reasoning and Information Retrieval

Feb 11, 2026
Via arXiv