Xiatian Zhu

From Senses to Decisions: The Information Flow of Auditory and Visual Perception in Multimodal LLMs

Jun 08, 2026

Beyond Consistency: Preserving Temporal Structure in Zero-Shot Video Editing

Jun 07, 2026

MMDG-Bench: A Benchmark for Multimodal Domain Generalization

May 30, 2026

From Imagined Futures to Executable Actions: Mixture of Latent Actions for Robot Manipulation

May 12, 2026

Reward-Guided Semantic Evolution for Test-time Adaptive Object Detection

May 06, 2026

FACTOR: Counterfactual Training-Free Test-Time Adaptation for Open-Vocabulary Object Detection

May 05, 2026

Source-Free Domain Adaptation with Vision-Language Prior

Apr 20, 2026

Uni-World VLA: Interleaved World Modeling and Planning for Autonomous Driving

Mar 28, 2026

Fast Low-light Enhancement and Deblurring for 3D Dark Scenes

Mar 09, 2026

RAID: Retrieval-Augmented Anomaly Detection

Feb 23, 2026