Zhiwen Fan

MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding

Jul 16, 2025 (via arXiv)

Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions

Jul 10, 2025 (via arXiv)

CryoFastAR: Fast Cryo-EM Ab Initio Reconstruction Made Easy

Jun 06, 2025 (via arXiv)

VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction

May 26, 2025 (via arXiv)

Generative AI for Autonomous Driving: Frontiers and Opportunities

May 13, 2025 (via arXiv)

Steepest Descent Density Control for Compact 3D Gaussian Splatting

May 08, 2025 (via arXiv)

Can Test-Time Scaling Improve World Foundation Model?

Mar 31, 2025 (via arXiv)

X$^{2}$-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction

Mar 27, 2025 (via arXiv)

Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields

Mar 26, 2025 (via arXiv)

VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment

Jan 03, 2025 (via arXiv)