Zhiyong Wu

Self-Guidance: Enhancing Neural Codecs via Decoder Manifold Alignment

Jun 11, 2026
Via arXiv

Feature-Aligned Speech Watermarking for Robustness to Reconstruction Distortions

Jun 10, 2026
Via arXiv

Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields

Jun 09, 2026
Via arXiv

Bypassing Copyright Protection in Diffusion-based Customization via Two-Stage Latent Feature Optimization

Jun 06, 2026
Via arXiv

VoxCPM2 Technical Report

Jun 05, 2026
Via arXiv

LoSATok: Low-dimensional Semantic-Acoustic Tokenizer for Cross-Domain Audio Understanding and Generation

May 27, 2026
Via arXiv

UniSRM: A Unified Speech Reward Model for Reasoning-Based Fine-grained Assessment

May 22, 2026
Via arXiv

OpenCompass: A Universal Evaluation Platform for Large Language Models

May 19, 2026
Via arXiv

How Should LLMs Listen While Speaking? A Study of User-Stream Routing in Full-Duplex Spoken Dialogue

May 11, 2026
Via arXiv

TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis

Apr 24, 2026
Via arXiv