Wenbo Hu

Pixal3D: Pixel-Aligned 3D Generation from Images

May 11, 2026
Via arXiv

Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers

Apr 23, 2026
Via arXiv

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

Apr 09, 2026
Via arXiv

Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

Apr 01, 2026
Via arXiv

Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels

Mar 05, 2026
Via arXiv

MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

Feb 09, 2026
Via arXiv

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

Jan 08, 2026
Via arXiv

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Dec 11, 2025
Via arXiv

Bridging the Gap Between Bayesian Deep Learning and Ensemble Weather Forecasts

Nov 18, 2025
Via arXiv

Interleaving Reasoning for Better Text-to-Image Generation

Sep 09, 2025
Via arXiv