Hengshuang Zhao

MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives

Dec 16, 2025
Via arXiv

DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning

Dec 14, 2025
Via arXiv

GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation

Dec 14, 2025
Via arXiv

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Dec 09, 2025
Via arXiv

Seg-VAR: Image Segmentation with Visual Autoregressive Modeling

Nov 16, 2025
Via arXiv

Visual Spatial Tuning

Nov 07, 2025
Via arXiv

From Noisy Traces to Stable Gradients: Bias-Variance Optimized Preference Optimization for Aligning Large Reasoning Models

Oct 06, 2025
Via arXiv

DiffCamera: Arbitrary Refocusing on Images

Sep 30, 2025
Via arXiv

Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search

Sep 09, 2025
Via arXiv

ROSE: Remove Objects with Side Effects in Videos

Aug 26, 2025
Via arXiv