Abstract:Production of photorealistic, navigable 3D site models requires a large volume of carefully collected images that are often unavailable to first responders for disaster relief or law enforcement. Real-world challenges include limited numbers of images, heterogeneous unposed cameras, inconsistent lighting, and extreme viewpoint differences for images collected from varying altitudes. To promote research aimed at addressing these challenges, we have developed the first public benchmark dataset for 3D reconstruction and novel view synthesis based on multiple calibrated ground-level, security-level, and airborne cameras. We present datasets that pose real-world challenges, independently evaluate calibration of unposed cameras and quality of novel rendered views, demonstrate baseline performance using recent state-of-practice methods, and identify challenges for further research.
Abstract:Urban 3D modeling from satellite images requires accurate semantic segmentation to delineate urban features, multiple view stereo for 3D reconstruction of surface heights, and 3D model fitting to produce compact models with accurate surface slopes. In this work, we present a cumulative assessment metric that succinctly captures error contributions from each of these components. We demonstrate our approach by providing challenging public datasets and extending two open source projects to provide an end-to-end 3D modeling baseline solution to stimulate further research and evaluation with a public leaderboard.
Abstract:The increasingly common use of incidental satellite images for stereo reconstruction versus rigidly tasked binocular or trinocular coincident collection is helping to enable timely global-scale 3D mapping; however, reliable stereo correspondence from multi-date image pairs remains very challenging due to seasonal appearance differences and scene change. Promising recent work suggests that semantic scene segmentation can provide a robust regularizing prior for resolving ambiguities in stereo correspondence and reconstruction problems. To enable research for pairwise semantic stereo and multi-view semantic 3D reconstruction with incidental satellite images, we have established a large-scale public dataset including multi-view, multi-band satellite images and ground truth geometric and semantic labels for two large cities. To demonstrate the complementary nature of the stereo and segmentation tasks, we present lightweight public baselines adapted from recent state of the art convolutional neural network models and assess their performance.