Abstract:Visual-inertial odometry (VIO) is an important technology for autonomous robots with power and payload constraints. In this paper, we propose a novel approach for VIO with stereo cameras which integrates and calibrates the velocity-control based kinematic motion model of wheeled mobile robots online. Including such a motion model can help to improve the accuracy of VIO. In contrast to several previous approaches that integrate wheel odometer measurements for this purpose, our method does not require wheel encoders and can be applied whenever the robot motion can be modeled with a velocity-control based kinematic motion model. We use radial basis function (RBF) kernels to compensate for the time delay and deviations between control commands and actual robot motion. The motion model is calibrated online by the VIO system and can be used as a forward model for motion control and planning. We evaluate our approach with data obtained in variously sized indoor environments, demonstrate improvements over a pure VIO method, and evaluate the prediction accuracy of the online calibrated model.
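As a rough illustration of the idea in the abstract above, the following minimal sketch blends recent velocity commands with normalized Gaussian RBF weights over their time lag and feeds the result into a planar unicycle model. The function names, the unicycle model, and the `center` and `width` parameters are assumptions made for this example; the abstract does not specify the paper's actual formulation, and in the paper such parameters would be calibrated online rather than fixed by hand.

```python
import math


def rbf_weights(lags, center, width):
    """Normalized Gaussian RBF weights over command time lags (seconds)."""
    raw = [math.exp(-((lag - center) ** 2) / (2.0 * width ** 2)) for lag in lags]
    total = sum(raw)
    return [r / total for r in raw]


def effective_command(history, dt, center, width):
    """Blend recent (v, omega) commands; history[-1] is the newest command.

    `center` plays the role of an assumed command-to-motion delay and
    `width` controls how many neighboring commands contribute.
    """
    n = len(history)
    lags = [(n - 1 - i) * dt for i in range(n)]
    weights = rbf_weights(lags, center, width)
    v = sum(w * cmd[0] for w, cmd in zip(weights, history))
    omega = sum(w * cmd[1] for w, cmd in zip(weights, history))
    return v, omega


def unicycle_step(pose, v, omega, dt):
    """One Euler step of a planar unicycle model; pose is (x, y, theta)."""
    x, y, theta = pose
    return (
        x + v * math.cos(theta) * dt,
        y + v * math.sin(theta) * dt,
        theta + omega * dt,
    )
```

With a constant command history the blended command equals that command, so the delay only matters while the commands are changing.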
Abstract:In this paper, we analyze the observability of visual-inertial odometry (VIO) using stereo cameras with a velocity-control based kinematic motion model. Previous work shows that, in the general case, the global position and yaw are unobservable in a VIO system; additionally, roll and pitch also become unobservable if there is no rotation. We prove that roll and pitch become observable when a planar motion constraint is integrated. We also show that the parameters of the motion model are observable.
Abstract:Differentiable physics is a powerful tool in computer vision and robotics for scene understanding and reasoning about interactions. Existing approaches have frequently been limited to objects with simple shapes or shapes that are known in advance. In this paper, we propose a novel approach to differentiable physics with frictional contacts which represents object shapes implicitly using signed distance fields (SDFs). Our simulation supports contact point calculation even when the involved shapes are nonconvex. Moreover, we propose ways of differentiating the dynamics with respect to the object shape to facilitate shape optimization using gradient-based methods. In our experiments, we demonstrate that our approach allows for model-based inference of physical parameters such as friction coefficients, mass, forces or shape parameters from trajectory and depth image observations in several challenging synthetic scenarios and a real image sequence.
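To make the role of SDFs in contact computation more concrete, here is a minimal sketch of a contact query against an implicitly represented nonconvex shape. It is not the paper's algorithm: the sample-point strategy, the function names, and the dumbbell shape built from two spheres are assumptions chosen for illustration, and the sketch covers neither friction nor differentiation.

```python
import math


def sphere_sdf(center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    def sdf(p):
        return math.dist(p, center) - radius
    return sdf


def union_sdf(*sdfs):
    """Union of shapes; the pointwise minimum keeps the sign test valid
    and can describe nonconvex shapes."""
    def sdf(p):
        return min(f(p) for f in sdfs)
    return sdf


def deepest_contact(points, sdf):
    """Return (point, signed distance) for the sample point that penetrates
    the shape the most, or None if no sample point is inside the shape."""
    best = min(points, key=sdf)
    depth = sdf(best)
    return (best, depth) if depth < 0.0 else None


# Hypothetical nonconvex "dumbbell" made of two overlapping unit spheres.
dumbbell = union_sdf(
    sphere_sdf((0.0, 0.0, 0.0), 1.0),
    sphere_sdf((1.5, 0.0, 0.0), 1.0),
)
```

In this sketch, the points passed to `deepest_contact` would be surface samples of a second object; a sample with negative signed distance indicates penetration, and its depth could feed a contact response.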
Abstract:Event cameras are promising devices for low-latency tracking and high-dynamic range imaging. In this paper, we propose a novel approach for 6 degree-of-freedom (6-DoF) object motion tracking that combines measurements of event and frame-based cameras. We formulate tracking from high-rate events with a probabilistic generative model of the event measurement process of the object. On a second layer, we refine the object trajectory in slower rate image frames through direct image alignment. We evaluate the accuracy of our approach in several object tracking scenarios with synthetic data, and also perform experiments with real data.
Abstract:In this paper, we learn dynamics models for parametrized families of dynamical systems with varying properties. The dynamics models are formulated as stochastic processes conditioned on a latent context variable which is inferred from observed transitions of the respective system. The probabilistic formulation allows us to compute an action sequence which, for a limited number of environment interactions, optimally explores the given system within the parametrized family. This is achieved by steering the system through the transitions that are most informative for the context variable. We demonstrate the effectiveness of our method for exploration on a non-linear toy problem and two well-known reinforcement learning environments.
Abstract:Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
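The two additions named in the abstract above, temporally-correlated actions and memory, can be sketched in a toy CEM planner as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the temporal correlation is produced here with simple AR(1) noise, "memory" is interpreted as carrying the elite set over to the next iteration, and all function names and default parameters are hypothetical.

```python
import random


def correlated_noise(rng, horizon, beta):
    """AR(1) noise with unit marginal variance; beta in [0, 1] sets how
    strongly consecutive time steps are correlated (0 gives white noise)."""
    scale = (1.0 - beta * beta) ** 0.5
    noise = [rng.gauss(0.0, 1.0)]
    for _ in range(horizon - 1):
        noise.append(beta * noise[-1] + scale * rng.gauss(0.0, 1.0))
    return noise


def cem_plan(cost, horizon, iters=10, samples=32, n_elite=4, beta=0.8, seed=0):
    """Minimize cost(actions) over 1-D action sequences of length horizon."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    elites = [list(mean)]  # memory, seeded with the initial mean sequence
    for _ in range(iters):
        # Memory: elites from the previous iteration compete again.
        candidates = list(elites)
        for _ in range(samples):
            eps = correlated_noise(rng, horizon, beta)
            candidates.append([m + s * e for m, s, e in zip(mean, std, eps)])
        candidates.sort(key=cost)
        elites = candidates[:n_elite]
        # Refit a per-time-step Gaussian to the elite set.
        columns = list(zip(*elites))
        mean = [sum(col) / n_elite for col in columns]
        std = [
            (sum((a - m) ** 2 for a in col) / n_elite) ** 0.5 + 1e-6
            for col, m in zip(columns, mean)
        ]
    return elites[0]  # best sequence found so far
```

A possible usage is `cem_plan(lambda a: sum((x - 0.5) ** 2 for x in a), horizon=5)`. Because the elites are carried over, the cost of the returned sequence never exceeds the cost of the initial all-zero sequence.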
Abstract:Dynamic scene understanding is an essential capability in robotics and VR/AR. In this paper, we propose Co-Section, an optimization-based approach to 3D dynamic scene reconstruction, which infers hidden shape information from intersection constraints. An object-level dynamic SLAM frontend detects, segments, tracks and maps dynamic objects in the scene. Our optimization backend completes the shapes using hull and intersection constraints between the objects. In experiments, we demonstrate our approach on real and synthetic dynamic scene datasets. We also assess the shape completion performance of our method quantitatively. To the best of our knowledge, our approach is the first method to incorporate such physical plausibility constraints on object intersections for shape completion of dynamic objects in an energy minimization framework.
Abstract:3D scene understanding from images is a challenging problem which is encountered in robotics, augmented reality and autonomous driving scenarios. In this paper, we propose a novel approach to jointly infer the 3D rigid-body poses and shapes of vehicles from stereo images of road scenes. Unlike previous work that relies on geometric alignment of shapes with dense stereo reconstructions, our approach works directly on images and infers shape and pose efficiently through combined photometric and silhouette alignment of 3D shape priors with a stereo image. We use a shape prior that represents cars in a low-dimensional linear embedding of volumetric signed distance functions. For efficiently measuring the consistency with both alignment terms, we propose an adaptive sparse point selection scheme. In experiments, we demonstrate superior performance of our method in pose estimation and shape reconstruction over a state-of-the-art approach that uses geometric alignment with dense stereo reconstructions. Our approach can also boost the performance of deep-learning based approaches to 3D object detection as a refinement method. We demonstrate that our method significantly improves accuracy for several recent detection approaches.