Yanwei Fu

ActiveVLA: Injecting Active Perception into Vision-Language-Action Models for Precise 3D Robotic Manipulation

Jan 13, 2026
Via arXiv

MVGGT: Multimodal Visual Geometry Grounded Transformer for Multiview 3D Referring Expression Segmentation

Jan 13, 2026
Via arXiv

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

Jan 08, 2026
Via arXiv

FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing

Jan 06, 2026
Via arXiv

DST-Calib: A Dual-Path, Self-Supervised, Target-Free LiDAR-Camera Extrinsic Calibration Network

Jan 03, 2026
Via arXiv

Schrödinger's Navigator: Imagining an Ensemble of Futures for Zero-Shot Object Navigation

Dec 24, 2025
Via arXiv

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Dec 15, 2025
Via arXiv

One-Step Generative Policies with Q-Learning: A Reformulation of MeanFlow

Nov 17, 2025
Via arXiv

VidSplice: Towards Coherent Video Inpainting via Explicit Spaced Frame Guidance

Oct 24, 2025
Via arXiv

SwiftVideo: A Unified Framework for Few-Step Video Generation through Trajectory-Distribution Alignment

Aug 08, 2025
Via arXiv