Cong Wang

Zhejiang University, Hangzhou, China

What Happens Inside Agent Memory? Circuit Analysis from Emergence to Diagnosis

May 05, 2026
Via arXiv

Embody4D: A Generalist 4D World Model for Embodied AI

May 03, 2026
Via arXiv

Denoise and Align: Diffusion-Driven Foreground Knowledge Prompting for Open-Vocabulary Temporal Action Detection

Apr 20, 2026
Via arXiv

HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models

Apr 14, 2026
Via arXiv

VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis

Apr 08, 2026
Via arXiv

See the Forest for the Trees: Loosely Speculative Decoding via Visual-Semantic Guidance for Efficient Inference of Video LLMs

Apr 07, 2026
Via arXiv

ParallelVLM: Lossless Video-LLM Acceleration with Visual Alignment Aware Parallel Speculative Decoding

Mar 23, 2026
Via arXiv

PhysVideo: Physically Plausible Video Generation with Cross-View Geometry Guidance

Mar 19, 2026
Via arXiv

Training-Free Sparse Attention for Fast Video Generation via Offline Layer-Wise Sparsity Profiling and Online Bidirectional Co-Clustering

Mar 19, 2026
Via arXiv

ARES: Scalable and Practical Gradient Inversion Attack in Federated Learning through Activation Recovery

Mar 18, 2026
Via arXiv