Title: Learning Multi-frame and Monocular Prior for Estimating Geometry in Dynamic Scenes

May 03, 2025

Seong Hyeon Park, Jinwoo Shin

Abstract: Estimating the 3D geometry of dynamic scenes captured in monocular videos is a fundamental challenge in computer vision. Object motion makes the task especially difficult: existing models predict only partial attributes of a dynamic scene, such as depth or pointmaps that span only a pair of frames. Because these attributes are inherently noisy across multiple frames, test-time global optimization is often employed to fully recover the geometry, but it is liable to failure and incurs heavy inference costs. To address this, we present a new model, coined MMP, that estimates geometry in a feed-forward manner and produces a dynamic pointmap representation that evolves over multiple frames. Specifically, building on the recent Siamese architecture, we introduce a new trajectory encoding module that projects point-wise dynamics onto the representation for each frame, which significantly improves expressiveness for dynamic scenes. In our experiments, MMP achieves state-of-the-art quality in feed-forward pointmap prediction, e.g., a 15.1% improvement in regression error.
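
To make the terms "dynamic pointmap" and "regression error" concrete, here is a minimal sketch in plain Python. It is not the authors' code or their exact metric, which the abstract does not specify. It assumes a dynamic pointmap is a sequence of per-frame H x W grids of 3D points expressed in one shared coordinate frame, and that the regression error is the mean Euclidean distance to ground truth. The names `pointmap_regression_error` and `relative_improvement` are hypothetical, and reading the reported 15.1% as a relative reduction in error is likewise an assumption.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
Pointmap = List[List[Point]]        # H x W grid of 3D points for one frame
DynamicPointmap = List[Pointmap]    # T frames (assumed: one shared coordinate frame)


def pointmap_regression_error(pred: DynamicPointmap, gt: DynamicPointmap) -> float:
    """Mean Euclidean distance between predicted and ground-truth 3D points.

    Assumption: this is one common way to score pointmaps, not necessarily
    the metric used in the paper.
    """
    total, count = 0.0, 0
    for pred_frame, gt_frame in zip(pred, gt):
        for pred_row, gt_row in zip(pred_frame, gt_frame):
            for p, q in zip(pred_row, gt_row):
                total += math.dist(p, q)
                count += 1
    if count == 0:
        raise ValueError("empty pointmaps")
    return total / count


def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Fractional reduction of new_error relative to baseline_error (assumed reading)."""
    return (baseline_error - new_error) / baseline_error


# Toy example: one frame, a 1 x 2 grid of points.
gt = [[[(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]]]
pred = [[[(0.0, 0.0, 1.0), (1.0, 0.0, 3.0)]]]
error = pointmap_regression_error(pred, gt)
```

Under this reading, a model whose error is 15.1% lower than a baseline's would give `relative_improvement(baseline, model)` of about 0.151.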
