Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Vincent Lepetit

GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery

May 20, 2024

Alexandre Cafaro, Amaury Leroy, Guillaume Beldjoudi, Pauline Maury, Charlotte Robert, Eric Deutsch, Vincent Grégoire, Vincent Lepetit, Nikos Paragios

Figure 1 for GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery

Figure 2 for GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery

Figure 3 for GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery

Figure 4 for GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery

Abstract:We introduce a novel unsupervised approach to reconstructing a 3D volume from only two planar projections that exploits a previous\-ly-captured 3D volume of the patient. Such volume is readily available in many important medical procedures and previous methods already used such a volume. Earlier methods that work by deforming this volume to match the projections typically fail when the number of projections is very low as the alignment becomes underconstrained. We show how to use a generative model of the volume structures to constrain the deformation and obtain a correct estimate. Moreover, our method is not bounded to a specific sensor calibration and can be applied to new calibrations without retraining. We evaluate our approach on a challenging dataset and show it outperforms state-of-the-art methods. As a result, our method could be used in treatment scenarios such as surgery and radiotherapy while drastically reducing patient radiation exposure.

Via

Access Paper or Ask Questions

PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Apr 16, 2024

Sinisa Stekovic, Stefan Ainetter, Mattia D'Urso, Friedrich Fraundorfer, Vincent Lepetit

Figure 1 for PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Figure 2 for PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Figure 3 for PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Figure 4 for PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Abstract:We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects from images using interpretable shape programs. In comparison to traditional CAD model retrieval methods, the use of shape programs for 3D reconstruction allows for reasoning about the semantic properties of reconstructed objects, editing, low memory footprint, etc. However, the utilization of shape programs for 3D scene understanding has been largely neglected in past works. As our main contribution, we enable gradient-based optimization by introducing a module that translates shape programs designed in Blender, for example, into efficient PyTorch code. We also provide a method that relies on PyTorchGeoNodes and is inspired by Monte Carlo Tree Search (MCTS) to jointly optimize discrete and continuous parameters of shape programs and reconstruct 3D objects for input scenes. In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions. Our experiments indicate that our reconstructions match well the input scenes while enabling semantic reasoning about reconstructed objects.

* In Submission

Via

Access Paper or Ask Questions

Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering

Mar 21, 2024

Antoine Guédon, Vincent Lepetit

Abstract:We propose Gaussian Frosting, a novel mesh-based representation for high-quality rendering and editing of complex 3D effects in real-time. Our approach builds on the recent 3D Gaussian Splatting framework, which optimizes a set of 3D Gaussians to approximate a radiance field from images. We propose first extracting a base mesh from Gaussians during optimization, then building and refining an adaptive layer of Gaussians with a variable thickness around the mesh to better capture the fine details and volumetric effects near the surface, such as hair or grass. We call this layer Gaussian Frosting, as it resembles a coating of frosting on a cake. The fuzzier the material, the thicker the frosting. We also introduce a parameterization of the Gaussians to enforce them to stay inside the frosting layer and automatically adjust their parameters when deforming, rescaling, editing or animating the mesh. Our representation allows for efficient rendering using Gaussian splatting, as well as editing and animation by modifying the base mesh. We demonstrate the effectiveness of our method on various synthetic and real scenes, and show that it outperforms existing surface-based approaches. We will release our code and a web-based viewer as additional contributions. Our project page is the following: https://anttwo.github.io/frosting/

* Project Webpage: https://anttwo.github.io/frosting/

Via

Access Paper or Ask Questions

BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects

Mar 14, 2024

Tomas Hodan, Martin Sundermeyer, Yann Labbe, Van Nguyen Nguyen, Gu Wang, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Jiri Matas

Figure 1 for BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects

Figure 2 for BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects

Figure 3 for BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects

Figure 4 for BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects

Abstract:We present the evaluation methodology, datasets and results of the BOP Challenge 2023, the fifth in a series of public competitions organized to capture the state of the art in model-based 6D object pose estimation from an RGB/RGB-D image and related tasks. Besides the three tasks from 2022 (model-based 2D detection, 2D segmentation, and 6D localization of objects seen during training), the 2023 challenge introduced new variants of these tasks focused on objects unseen during training. In the new tasks, methods were required to learn new objects during a short onboarding stage (max 5 minutes, 1 GPU) from provided 3D object models. The best 2023 method for 6D localization of unseen objects (GenFlow) notably reached the accuracy of the best 2020 method for seen objects (CosyPose), although being noticeably slower. The best 2023 method for seen objects (GPose) achieved a moderate accuracy improvement but a significant 43% run-time improvement compared to the best 2022 counterpart (GDRNPP). Since 2017, the accuracy of 6D localization of seen objects has improved by more than 50% (from 56.9 to 85.6 AR_C). The online evaluation system stays open and is available at: http://bop.felk.cvut.cz/.

* arXiv admin note: substantial text overlap with arXiv:2302.13075

Via

Access Paper or Ask Questions

Correspondences of the Third Kind: Camera Pose Estimation from Object Reflection

Dec 07, 2023

Kohei Yamashita, Vincent Lepetit, Ko Nishino

Abstract:Computer vision has long relied on two kinds of correspondences: pixel correspondences in images and 3D correspondences on object surfaces. Is there another kind, and if there is, what can they do for us? In this paper, we introduce correspondences of the third kind we call reflection correspondences and show that they can help estimate camera pose by just looking at objects without relying on the background. Reflection correspondences are point correspondences in the reflected world, i.e., the scene reflected by the object surface. The object geometry and reflectance alters the scene geometrically and radiometrically, respectively, causing incorrect pixel correspondences. Geometry recovered from each image is also hampered by distortions, namely generalized bas-relief ambiguity, leading to erroneous 3D correspondences. We show that reflection correspondences can resolve the ambiguities arising from these distortions. We introduce a neural correspondence estimator and a RANSAC algorithm that fully leverages all three kinds of correspondences for robust and accurate joint camera pose and object shape estimation just from the object appearance. The method expands the horizon of numerous downstream tasks, including camera pose estimation for appearance modeling (e.g., NeRF) and motion estimation of reflective objects (e.g., cars on the road), to name a few, as it relieves the requirement of overlapping background.

Via

Access Paper or Ask Questions

SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Dec 02, 2023

Antoine Guédon, Vincent Lepetit

Figure 1 for SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Figure 2 for SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Figure 3 for SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Figure 4 for SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Abstract:We propose a method to allow precise and extremely fast mesh extraction from 3D Gaussian Splatting. Gaussian Splatting has recently become very popular as it yields realistic rendering while being significantly faster to train than NeRFs. It is however challenging to extract a mesh from the millions of tiny 3D gaussians as these gaussians tend to be unorganized after optimization and no method has been proposed so far. Our first key contribution is a regularization term that encourages the gaussians to align well with the surface of the scene. We then introduce a method that exploits this alignment to extract a mesh from the Gaussians using Poisson reconstruction, which is fast, scalable, and preserves details, in contrast to the Marching Cubes algorithm usually applied to extract meshes from Neural SDFs. Finally, we introduce an optional refinement strategy that binds gaussians to the surface of the mesh, and jointly optimizes these Gaussians and the mesh through Gaussian splatting rendering. This enables easy editing, sculpting, rigging, animating, compositing and relighting of the Gaussians using traditional softwares by manipulating the mesh instead of the gaussians themselves. Retrieving such an editable mesh for realistic rendering is done within minutes with our method, compared to hours with the state-of-the-art methods on neural SDFs, while providing a better rendering quality. Our project page is the following: https://anttwo.github.io/sugar/

* We identified a minor typographical error in Equation 6; We updated the paper accordingly. Project Webpage: https://anttwo.github.io/sugar/

Via

Access Paper or Ask Questions

GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence

Nov 23, 2023

Van Nguyen Nguyen, Thibault Groueix, Mathieu Salzmann, Vincent Lepetit

Abstract:We present GigaPose, a fast, robust, and accurate method for CAD-based novel object pose estimation in RGB images. GigaPose first leverages discriminative templates, rendered images of the CAD models, to recover the out-of-plane rotation and then uses patch correspondences to estimate the four remaining parameters. Our approach samples templates in only a two-degrees-of-freedom space instead of the usual three and matches the input image to the templates using fast nearest neighbor search in feature space, results in a speedup factor of 38x compared to the state of the art. Moreover, GigaPose is significantly more robust to segmentation errors. Our extensive evaluation on the seven core datasets of the BOP challenge demonstrates that it achieves state-of-the-art accuracy and can be seamlessly integrated with a refinement method. Additionally, we show the potential of GigaPose with 3D models predicted by recent work on 3D reconstruction from a single image, relaxing the need for CAD models and making 6D pose object estimation much more convenient. Our source code and trained models are publicly available at https://github.com/nv-nguyen/gigaPose

Via

Access Paper or Ask Questions

BEVContrast: Self-Supervision in BEV Space for Automotive Lidar Point Clouds

Oct 26, 2023

Corentin Sautier, Gilles Puy, Alexandre Boulch, Renaud Marlet, Vincent Lepetit

Abstract:We present a surprisingly simple and efficient method for self-supervision of 3D backbone on automotive Lidar point clouds. We design a contrastive loss between features of Lidar scans captured in the same scene. Several such approaches have been proposed in the literature from PointConstrast, which uses a contrast at the level of points, to the state-of-the-art TARL, which uses a contrast at the level of segments, roughly corresponding to objects. While the former enjoys a great simplicity of implementation, it is surpassed by the latter, which however requires a costly pre-processing. In BEVContrast, we define our contrast at the level of 2D cells in the Bird's Eye View plane. Resulting cell-level representations offer a good trade-off between the point-level representations exploited in PointContrast and segment-level representations exploited in TARL: we retain the simplicity of PointContrast (cell representations are cheap to compute) while surpassing the performance of TARL in downstream semantic segmentation.

* Accepted to 3DV 2024

Via

Access Paper or Ask Questions

HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans

Sep 12, 2023

Stefan Ainetter, Sinisa Stekovic, Friedrich Fraundorfer, Vincent Lepetit

Figure 1 for HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans

Figure 2 for HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans

Figure 3 for HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans

Figure 4 for HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans

Abstract:We present an automated and efficient approach for retrieving high-quality CAD models of objects and their poses in a scene captured by a moving RGB-D camera. We first investigate various objective functions to measure similarity between a candidate CAD object model and the available data, and the best objective function appears to be a "render-and-compare" method comparing depth and mask rendering. We thus introduce a fast-search method that approximates an exhaustive search based on this objective function for simultaneously retrieving the object category, a CAD model, and the pose of an object given an approximate 3D bounding box. This method involves a search tree that organizes the CAD models and object properties including object category and pose for fast retrieval and an algorithm inspired by Monte Carlo Tree Search, that efficiently searches this tree. We show that this method retrieves CAD models that fit the real objects very well, with a speed-up factor of 10x to 120x compared to exhaustive search.

Via

Access Paper or Ask Questions

CNOS: A Strong Baseline for CAD-based Novel Object Segmentation

Aug 03, 2023

Van Nguyen Nguyen, Tomas Hodan, Georgy Ponimatkin, Thibault Groueix, Vincent Lepetit

Figure 1 for CNOS: A Strong Baseline for CAD-based Novel Object Segmentation

Figure 2 for CNOS: A Strong Baseline for CAD-based Novel Object Segmentation

Figure 3 for CNOS: A Strong Baseline for CAD-based Novel Object Segmentation

Figure 4 for CNOS: A Strong Baseline for CAD-based Novel Object Segmentation

Abstract:We propose a simple three-stage approach to segment unseen objects in RGB images using their CAD models. Leveraging recent powerful foundation models, DINOv2 and Segment Anything, we create descriptors and generate proposals, including binary masks for a given input RGB image. By matching proposals with reference descriptors created from CAD models, we achieve precise object ID assignment along with modal masks. We experimentally demonstrate that our method achieves state-of-the-art results in CAD-based novel object segmentation, surpassing existing approaches on the seven core datasets of the BOP challenge by 19.8% AP using the same BOP evaluation protocol. Our source code is available at https://github.com/nv-nguyen/cnos.

Via

Access Paper or Ask Questions