Cong Wang

Zhejiang University, Hangzhou, China

Future Forcing: Future-aware Training-free KV Cache Policy for Autoregressive Video Generation

May 28, 2026
Via arXiv

A Signal-Language Foundation Model for Broad-Spectrum Cardiovascular Assessment from Routine Electrocardiography

May 25, 2026
Via arXiv

SpongeBob: Sync-Aware Harmonious Audio-Visual Generative Editing

May 24, 2026
Via arXiv

Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding

May 19, 2026
Via arXiv

GTA: Advancing Image-to-3D World Generation via Geometry Then Appearance Video Diffusion

May 13, 2026
Via arXiv

What Happens Inside Agent Memory? Circuit Analysis from Emergence to Diagnosis

May 05, 2026
Via arXiv

Embody4D: A Generalist 4D World Model for Embodied AI

May 03, 2026
Via arXiv

Denoise and Align: Diffusion-Driven Foreground Knowledge Prompting for Open-Vocabulary Temporal Action Detection

Apr 20, 2026
Via arXiv

HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models

Apr 14, 2026
Via arXiv

VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis

Apr 08, 2026
Via arXiv