
Wei He

and Other Contributors

InterDyad: Interactive Dyadic Speech-to-Video Generation by Querying Intermediate Visual Guidance

Mar 24, 2026
Via arXiv

StreamingClaw Technical Report

Mar 23, 2026
Via arXiv

MVHOI: Bridge Multi-view Condition to Complex Human-Object Interaction Video Reenactment via 3D Foundation Model

Mar 16, 2026
Via arXiv

DISPLAY: Directable Human-Object Interaction Video Generation via Sparse Motion Guidance and Multi-Task Auxiliary

Mar 10, 2026
Via arXiv

On Multi-Step Theorem Prediction via Non-Parametric Structural Priors

Mar 05, 2026
Via arXiv

From Intuition to Investigation: A Tool-Augmented Reasoning MLLM Framework for Generalizable Face Anti-Spoofing

Mar 01, 2026
Via arXiv

Reinforcing Real-world Service Agents: Balancing Utility and Cost in Task-oriented Dialogue

Feb 26, 2026
Via arXiv

Multi-Modal Representation Learning via Semi-Supervised Rate Reduction for Generalized Category Discovery

Feb 23, 2026
Via arXiv

CAPER: Constrained and Procedural Reasoning for Robotic Scientific Experiments

Feb 10, 2026
Via arXiv

Differentiate-and-Inject: Enhancing VLAs via Functional Differentiation Induced by In-Parameter Structural Reasoning

Feb 07, 2026
Via arXiv