Title: DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion

May 08, 2025

Qitao Zhao, Amy Lin, Jeff Tan, Jason Y. Zhang, Deva Ramanan, Shubham Tulsiani

Abstract: Current Structure-from-Motion (SfM) methods typically follow a two-stage pipeline, combining learned or geometric pairwise reasoning with a subsequent global optimization step. In contrast, we propose a data-driven multi-view reasoning approach that directly infers 3D scene geometry and camera poses from multi-view images. Our framework, DiffusionSfM, parameterizes scene geometry and cameras as pixel-wise ray origins and endpoints in a global frame and employs a transformer-based denoising diffusion model to predict them from multi-view inputs. To address practical challenges in training diffusion models with missing data and unbounded scene coordinates, we introduce specialized mechanisms that ensure robust learning. We empirically validate DiffusionSfM on both synthetic and real datasets, demonstrating that it outperforms classical and learning-based approaches while naturally modeling uncertainty.
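
To make the "pixel-wise ray origins and endpoints in a global frame" parameterization concrete, the following is a minimal sketch of how such a representation could be built from a known camera and depth map. It is not the authors' implementation. It assumes a pinhole camera with intrinsics `K`, a world-to-camera pose `x_cam = R @ x_world + t`, z-depth sampled at pixel centers, and a hypothetical function name `rays_from_camera`; none of these conventions are stated in the abstract.

```python
import numpy as np

def rays_from_camera(K, R, t, depth):
    """Illustrative only: per-pixel ray origins and endpoints in the world frame.

    Assumed conventions (not taken from the paper):
      K     : (3, 3) pinhole intrinsics
      R, t  : world-to-camera pose, x_cam = R @ x_world + t
      depth : (H, W) z-depth along the camera's optical axis
    """
    H, W = depth.shape
    # Pixel-center coordinates in homogeneous form, shape (H, W, 3).
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)

    # Back-project: K^{-1} [u, v, 1]^T has z = 1, so scaling by z-depth
    # gives the 3D point in the camera frame.
    pts_cam = (pix @ np.linalg.inv(K).T) * depth[..., None]

    # Camera frame -> world frame: x_world = R^T (x_cam - t).
    # In row-vector form this is (x_cam - t) @ R.
    endpoints = (pts_cam - t) @ R

    # Every ray of one image starts at the camera center, -R^T t.
    center = -R.T @ t
    origins = np.broadcast_to(center, endpoints.shape).copy()
    return origins, endpoints
```

Under these assumptions, the origins of one image all coincide with its camera center, so they encode the camera position, while the endpoints encode the visible surface. A model that predicts both per pixel therefore describes cameras and geometry in one shared representation.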

* CVPR 2025. Project website: https://qitaozhao.github.io/DiffusionSfM


