Accurate rotation estimation is at the heart of robot perception tasks such as visual odometry and object pose estimation. Deep neural networks have provided a new way to perform these tasks, and the choice of rotation representation is an important part of network design. In this work, we present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important properties that make it particularly suitable for learned models: (1) it satisfies a smoothness property that improves convergence and generalization when regressing large rotation targets, and (2) it encodes a symmetric Bingham belief over the space of unit quaternions, permitting the training of uncertainty-aware models. We empirically validate the benefits of our formulation by training deep neural rotation regressors on two data modalities. First, we use synthetic point-cloud data to show that our representation leads to superior predictive accuracy over existing representations for arbitrary rotation targets. Second, we use image data collected onboard ground and aerial vehicles to demonstrate that our representation is amenable to an effective out-of-distribution (OOD) rejection technique that significantly improves the robustness of rotation estimates to unseen environmental effects and corrupted input images, without requiring the use of an explicit likelihood loss, stochastic sampling, or an auxiliary classifier. This capability is key for safety-critical applications where detecting novel inputs can prevent catastrophic failure of learned models.
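As a rough illustration of the symmetric matrix representation described in the abstract above, the sketch below maps ten free parameters to a symmetric 4x4 matrix and reads off a unit quaternion. It assumes the rotation is recovered as the minimizer of $q^\top A q$ over unit quaternions, i.e., the eigenvector associated with the smallest eigenvalue. The function names `symmetric_from_params` and `quaternion_from_symmetric` are hypothetical and do not come from the authors' code.

```python
import numpy as np

def symmetric_from_params(theta):
    """Fill a symmetric 4x4 matrix from 10 free parameters (upper triangle, row-major)."""
    theta = np.asarray(theta, dtype=float)
    A = np.zeros((4, 4))
    A[np.triu_indices(4)] = theta          # 10 entries on and above the diagonal
    return A + np.triu(A, k=1).T           # mirror the strict upper triangle

def quaternion_from_symmetric(A):
    """Return the unit eigenvector of A with the smallest eigenvalue, plus all eigenvalues.

    Assumption: the rotation estimate is argmin_{|q|=1} q^T A q.
    """
    eigvals, eigvecs = np.linalg.eigh(A)   # eigenvalues in ascending order
    q = eigvecs[:, 0]
    if q[0] < 0:                           # q and -q encode the same rotation
        q = -q
    return q, eigvals

rng = np.random.default_rng(0)
A = symmetric_from_params(rng.normal(size=10))
q, eigvals = quaternion_from_symmetric(A)
```

In a learned model, the ten parameters would be the raw network outputs; the spread of the returned eigenvalues could plausibly serve as a confidence signal, although the exact measure used in the paper is not stated in the abstract.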
Correct fusion of data from two sensors is not possible without an accurate estimate of their relative pose, which can be determined through the process of extrinsic calibration. When two or more sensors are capable of producing their own egomotion estimates (i.e., measurements of their trajectories through an environment), the 'hand-eye' formulation of extrinsic calibration can be employed. In this paper, we extend our recent work on a convex optimization approach for hand-eye calibration to the case where one of the sensors cannot observe the scale of its translational motion (e.g., a monocular camera observing an unmapped environment). We prove that our technique is able to provide a certifiably globally optimal solution to both the known- and unknown-scale variants of hand-eye calibration, provided that the measurement noise is bounded. Herein, we focus on the theoretical aspects of the problem, show the tightness and stability of our solution, and demonstrate the optimality and speed of our algorithm through experiments with synthetic data.
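The hand-eye constraint referred to in the abstract above can be checked numerically. The sketch below is only a consistency check under stated assumptions, not the convex solver from the paper: it assumes the standard relation $AX = XB$ between paired egomotion increments $A$ and $B$ and the extrinsic transform $X$, and models the unknown scale as a single factor multiplying the observed translation of the second sensor. All function names are hypothetical.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR decomposition."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:               # flip one column to land in SO(3)
        Q[:, 0] *= -1
    return Q

def make_pose(R, t):
    """Assemble a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def hand_eye_residuals(A, B_obs, X, scale=1.0):
    """Residuals of A X = X B, with B's observed translation multiplied by `scale`."""
    R_a, t_a = A[:3, :3], A[:3, 3]
    R_b, t_b = B_obs[:3, :3], scale * B_obs[:3, 3]
    R_x, t_x = X[:3, :3], X[:3, 3]
    rot_res = R_a @ R_x - R_x @ R_b
    trans_res = R_a @ t_x + t_a - (R_x @ t_b + t_x)
    return rot_res, trans_res

rng = np.random.default_rng(1)
X = make_pose(random_rotation(rng), rng.normal(size=3))   # true extrinsics
B = make_pose(random_rotation(rng), rng.normal(size=3))   # true motion of sensor b
A = X @ B @ np.linalg.inv(X)                              # implied motion of sensor a
alpha = 2.5                                               # unknown scale (illustrative)
B_obs = make_pose(B[:3, :3], B[:3, 3] / alpha)            # scale-free observation
rot_ok, trans_ok = hand_eye_residuals(A, B_obs, X, scale=alpha)
_, trans_bad = hand_eye_residuals(A, B_obs, X, scale=1.0)
```

With the correct scale both residuals vanish, whereas ignoring the scale leaves a nonzero translation residual; the rotation constraint is unaffected by scale.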
Inverse kinematics is a fundamental problem for articulated robots: fast and accurate algorithms are needed for translating task-related workspace constraints and goals into feasible joint configurations. In general, inverse kinematics for serial kinematic chains is a difficult nonlinear problem, for which closed-form solutions cannot be easily obtained. Therefore, computationally efficient numerical methods that can be adapted to a general class of manipulators are of great importance. In this paper, we use convex optimization techniques to solve the inverse kinematics problem with joint limit constraints for highly redundant serial kinematic chains with spherical joints in two and three dimensions. This is accomplished through a novel formulation of inverse kinematics as a nearest point problem, and with a fast sum of squares solver that exploits the sparsity of kinematic constraints for serial manipulators. Our method has the advantages of post-hoc certification of global optimality and a runtime that scales polynomially with the number of degrees of freedom. Additionally, we prove that our convex relaxation leads to a globally optimal solution when certain conditions are met, and demonstrate empirically that these conditions are common and represent many practical instances. Finally, we provide an open source implementation of our algorithm.
Estimating unknown rotations from noisy measurements is an important step in structure from motion (SfM) and other 3D vision tasks. Typically, local optimization methods susceptible to returning suboptimal local minima are used to solve the rotation averaging problem. A new wave of approaches that leverage convex relaxations has provided the first formal guarantees of global optimality for state estimation techniques involving SO(3). However, most of these guarantees are only applicable when the measurement error introduced by noise is within a certain bound that depends on the problem instance's structure. In this paper, we cast rotation averaging as a polynomial optimization problem over unit quaternions to produce the first rotation averaging method that is formally guaranteed to provide a certifiably globally optimal solution for \textit{any} problem instance. This is achieved by formulating and solving a sparse convex sum of squares (SOS) relaxation of the problem. We provide an open source implementation of our algorithm and experiments, demonstrating the benefits of our globally optimal approach.
Accurate estimates of rotation are crucial to vision-based motion estimation in augmented reality and robotics. In this work, we present a method to extract probabilistic estimates of rotation from deep regression models. First, we build on prior work and argue that a multi-headed network structure we name HydraNet provides better calibrated uncertainty estimates than methods that rely on stochastic forward passes. Second, we extend HydraNet to targets that belong to the rotation group, SO(3), by regressing unit quaternions and using the tools of rotation averaging and uncertainty injection onto the manifold to produce three-dimensional covariances. Finally, we present results and analysis on a synthetic dataset, learn consistent orientation estimates on the 7-Scenes dataset, and show how we can use our learned covariances to fuse deep estimates of relative orientation with classical stereo visual odometry to improve localization on the KITTI dataset.
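To make the idea of combining several quaternion outputs concrete, the following sketch averages the predictions of multiple heads using the standard eigenvector-based quaternion mean. This is one common averaging technique, chosen here as an assumption; the abstract above does not specify which averaging scheme the paper uses. The function name `average_quaternions` is hypothetical.

```python
import numpy as np

def average_quaternions(quats):
    """Average unit quaternions (rows, ordered w, x, y, z) via the dominant eigenvector.

    The accumulator sums the outer products q q^T, so it is unchanged when any
    input is replaced by its negative (q and -q encode the same rotation).
    """
    Q = np.asarray(quats, dtype=float)
    Q = Q / np.linalg.norm(Q, axis=1, keepdims=True)   # re-normalize each head
    M = Q.T @ Q                                        # 4x4 sum of outer products
    _, eigvecs = np.linalg.eigh(M)                     # eigenvalues ascending
    mean = eigvecs[:, -1]                              # largest-eigenvalue eigenvector
    return mean if mean[0] >= 0 else -mean

base = np.array([np.cos(0.3), np.sin(0.3), 0.0, 0.0])
heads = [base, -base, base]                            # same rotation, mixed signs
mean_q = average_quaternions(heads)
```

The spread of the individual heads around this mean is what would then be mapped to a three-dimensional covariance; that step is omitted here.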
We present a certifiably globally optimal algorithm for determining the extrinsic calibration between two sensors that are capable of producing independent egomotion estimates. This problem has been previously solved using a variety of techniques, including local optimization approaches that have no formal global optimality guarantees. We use a quadratic objective function to formulate calibration as a quadratically constrained quadratic program (QCQP). By leveraging recent advances in the optimization of QCQPs, we are able to use existing semidefinite program (SDP) solvers to obtain a certifiably globally optimal solution via the Lagrangian dual problem. Our problem formulation can be globally optimized by existing general-purpose solvers in less than a second, regardless of the number of measurements available and the noise level. This enables a variety of robotic platforms to rapidly and robustly compute and certify a globally optimal set of calibration parameters without a prior estimate or operator intervention. We compare the performance of our approach with a local solver on extensive simulations and multiple real datasets. Finally, we present necessary observability conditions that connect our approach to recent theoretical results and analytically support the empirical performance of our system.
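For readers unfamiliar with the QCQP-to-SDP route mentioned in the abstract above, the generic primal-dual pair can be written as follows. The symbols $Q$, $A_i$, $b_i$, and $\lambda$ are illustrative and are not the paper's notation; $Q$ and each $A_i$ are assumed symmetric.

```latex
% Generic equality-constrained QCQP and its Lagrangian dual (illustrative symbols)
\begin{align}
p^\star &= \min_{x \in \mathbb{R}^n} \; x^\top Q x
  \quad \text{s.t.} \quad x^\top A_i x = b_i, \quad i = 1, \dots, m, \\
d^\star &= \max_{\lambda \in \mathbb{R}^m} \; \sum_{i=1}^{m} \lambda_i b_i
  \quad \text{s.t.} \quad Q - \sum_{i=1}^{m} \lambda_i A_i \succeq 0 .
\end{align}
```

The dual follows from the Lagrangian $x^\top \bigl(Q - \sum_i \lambda_i A_i\bigr) x + \sum_i \lambda_i b_i$, whose infimum over $x$ is finite only when the matrix in parentheses is positive semidefinite. The dual is an SDP, and weak duality gives $d^\star \le p^\star$; a feasible $x$ whose cost equals $d^\star$ is therefore certified as globally optimal.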
This paper explores the use of an entropy-based technique for point cloud reconstruction with the goal of calibrating a lidar to a sensor capable of providing egomotion information. We extend recent work in this area to the problem of recovering the $Sim(3)$ transformation between a 2D lidar and a rigidly attached monocular camera, where the scale of the camera trajectory is not known a priori. We demonstrate the robustness of our approach on realistic simulations in multiple environments, as well as on data collected from a hand-held sensor rig. Given a non-degenerate trajectory and a sufficient number of lidar measurements, our calibration procedure achieves millimetre-scale and sub-degree accuracy. Moreover, our method relaxes the need for specific scene geometry, fiducial markers, or overlapping sensor fields of view, which had previously limited similar techniques.
Inter-robot loop closure detection is a core problem in collaborative SLAM (CSLAM). Establishing inter-robot loop closures is a resource-demanding process, during which robots must consume a substantial amount of mission-critical resources (e.g., battery and bandwidth) to exchange sensory data. However, even with the most resource-efficient techniques, the resources available onboard may be insufficient for verifying every potential loop closure. This work addresses this critical challenge by proposing a resource-adaptive framework for distributed loop closure detection. We seek to maximize task-oriented objectives subject to a budget constraint on total data transmission. This problem is in general NP-hard. We approach this problem from different perspectives and leverage existing results on monotone submodular maximization to provide efficient approximation algorithms with performance guarantees. The proposed approach is extensively evaluated using the KITTI odometry benchmark dataset and synthetic Manhattan-like datasets.
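The budgeted selection problem described in the abstract above can be illustrated with a generic cost-benefit greedy rule on a coverage objective, which is monotone and submodular. This is a textbook-style heuristic under assumed inputs, not the paper's algorithm, and it does not reproduce the paper's performance guarantees. The candidate names, costs, and covered sets are hypothetical.

```python
def greedy_budgeted_coverage(candidates, budget):
    """Greedily pick candidates by marginal coverage per unit cost within a budget.

    candidates: dict mapping a name to (cost, set_of_covered_items); costs must be > 0.
    Returns the chosen names in selection order, the covered items, and the total cost.
    """
    chosen, covered, spent = [], set(), 0.0
    remaining = dict(candidates)
    while remaining:
        best, best_ratio = None, 0.0
        for name, (cost, items) in remaining.items():
            if spent + cost > budget:
                continue                       # would exceed the transmission budget
            ratio = len(items - covered) / cost
            if ratio > best_ratio:             # strictly positive marginal gain only
                best, best_ratio = name, ratio
        if best is None:
            break                              # nothing affordable adds coverage
        cost, items = remaining.pop(best)
        chosen.append(best)
        covered |= items
        spent += cost
    return chosen, covered, spent

# Hypothetical messages: cost in arbitrary data units, items = loop-closure candidates.
candidates = {
    "a": (2, {1, 2, 3}),
    "b": (1, {3, 4}),
    "c": (3, {5, 6, 7, 8}),
    "d": (1, {1}),
}
chosen, covered, spent = greedy_budgeted_coverage(candidates, budget=4)
```

Here each candidate stands for a piece of data a robot could transmit, and its covered set stands for the potential loop closures that transmission would allow to be verified.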
Due to the distributed nature of cooperative simultaneous localization and mapping (CSLAM), detecting inter-robot loop closures necessitates sharing sensory data with other robots. A na\"{\i}ve approach to data sharing can easily lead to a waste of mission-critical resources. This paper investigates the logistical aspects of CSLAM. In particular, we present a general resource-efficient communication planning framework that takes into account both the total amount of exchanged data and the induced division of labor between the participating robots. Compared to other state-of-the-art approaches, our framework is able to verify the same set of potential inter-robot loop closures while exchanging considerably less data and influencing the induced workloads. We develop a fast algorithm for finding globally optimal communication policies, and present theoretical analysis to characterize the necessary and sufficient conditions under which simpler strategies are optimal. The proposed framework is extensively evaluated with data from the KITTI odometry benchmark datasets.