Zhi Wang

Moore Threads

What Drives Test-Time Adaptation for CLIP? A Controlled Empirical Study from an Update Perspective

Jun 12, 2026

HDSL: A Hierarchical Domain-Specific Language for Structured 3D Indoor Scene Generation and Localized Editing with LLM Agents

Jun 08, 2026

ElegantVLA: Learning When to Think for Efficient Vision-Language-Action Models

May 28, 2026

HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos

May 28, 2026

Test-time Sparsity for Extreme Fast Action Diffusion

May 13, 2026

InkDrop: Invisible Backdoor Attacks Against Dataset Condensation

Mar 30, 2026

FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair

Mar 18, 2026

RoboStream: Weaving Spatio-Temporal Reasoning with Memory in Vision-Language Models for Robotics

Mar 13, 2026

Thinking in Dynamics: How Multimodal Large Language Models Perceive, Track, and Reason Dynamics in Physical 4D World

Mar 13, 2026

StructBiHOI: Structured Articulation Modeling for Long-Horizon Bimanual Hand-Object Interaction Generation

Mar 10, 2026