The rise of deep learning has caused a paradigm shift in robotics research, favoring methods that require large amounts of data. It is prohibitively expensive to generate such data sets on a physical platform. Therefore, state-of-the-art approaches learn in simulation where data generation is fast as well as inexpensive and subsequently transfer the knowledge to the real robot (sim-to-real). Despite becoming increasingly realistic, all simulators are by construction based on models, hence inevitably imperfect. This raises the question of how simulators can be modified to facilitate learning robot control policies and overcome the mismatch between simulation and reality, often called the 'reality gap'. We provide a comprehensive review of sim-to-real research for robotics, focusing on a technique named 'domain randomization' which is a method for learning from randomized simulations.
In an environment where a manipulator needs to execute multiple pick-and-place tasks, the act of object manoeuvre will change the underlying configuration space, which in turn affects all subsequent tasks. Previously free configurations might now be occupied by the newly placed objects, and previously occupied space might now open up new paths. We propose Lazy Tree-based Replanner (LTR*) -- a novel hybrid planner that inherits the rapid planning nature of existing anytime incremental sampling-based planners, and at the same time allows subsequent tasks to leverage prior experience via a lazy experience graph. Previous experience is summarised in a lazy graph structure, and LTR* is formulated such that it is robust and beneficial regardless of the extent of changes in the workspace. Our hybrid approach attains a faster speed in obtaining an initial solution than existing roadmap-based planners and often with a lower cost in trajectory length. Subsequent tasks can utilise the lazy experience graph to speed up finding a solution and take advantage of the optimised graph to minimise the cost objective. We provide rigorous proofs of probabilistic completeness and asymptotic optimal guarantees. Experimentally, we show that in repeated pick-and-place tasks, LTR* attains a high gain in performance when planning for subsequent tasks.
To accurately reproduce measurements from the real world, simulators need to have an adequate model of the physical system and require the parameters of the model be identified. We address the latter problem of estimating parameters through a Bayesian inference approach that approximates a posterior distribution over simulation parameters given real sensor measurements. By extending the commonly used Gaussian likelihood model for trajectories via the multiple-shooting formulation, our chosen particle-based inference algorithm Stein Variational Gradient Descent is able to identify highly nonlinear, underactuated systems. We leverage GPU code generation and differentiable simulation to evaluate the likelihood and its gradient for many particles in parallel. Our algorithm infers non-parametric distributions over simulation parameters more accurately than comparable baselines and handles constraints over parameters efficiently through gradient-based optimization. We evaluate estimation performance on several physical experiments. On an underactuated mechanism where a 7-DOF robot arm excites an object with an unknown mass configuration, we demonstrate how our inference technique can identify symmetries between the parameters and provide highly accurate predictions. Project website: https://uscresl.github.io/prob-diff-sim
We propose Parallelised Diffeomorphic Sampling-based Motion Planning (PDMP). PDMP is a novel parallelised framework that uses bijective and differentiable mappings, or diffeomorphisms, to transform sampling distributions of sampling-based motion planners, in a manner akin to normalising flows. Unlike normalising flow models which use invertible neural network structures to represent these diffeomorphisms, we develop them from gradient information of desired costs, and encode desirable behaviour, such as obstacle avoidance. These transformed sampling distributions can then be used for sampling-based motion planning. A particular example is when we wish to imbue the sampling distribution with knowledge of the environment geometry, such that drawn samples are less prone to be in collisions. To this end, we propose to learn a continuous occupancy representation from environment occupancy data, such that gradients of the representation defines a valid diffeomorphism and is amenable to fast parallel evaluation. We use this to "morph" the sampling distribution to draw far fewer collision-prone samples. PDMP is able to leverage gradient information of costs, to inject specifications, in a manner similar to optimisation-based motion planning methods, but relies on drawing from a sampling distribution, retaining the tendency to find more global solutions, thereby bridging the gap between trajectory optimisation and sampling-based planning methods.
BayesSim is a statistical technique for domain randomization in reinforcement learning based on likelihood-free inference of simulation parameters. This paper outlines BayesSimIG: a library that provides an implementation of BayesSim integrated with the recently released NVIDIA IsaacGym. This combination allows large-scale parameter inference with end-to-end GPU acceleration. Both inference and simulation get GPU speedup, with support for running more than 10K parallel simulation environments for complex robotics tasks that can have more than 100 simulation parameters to estimate. BayesSimIG provides an integration with TensorBoard to easily visualize slices of high-dimensional posteriors. The library is built in a modular way to support research experiments with novel ways to collect and process the trajectories from the parallel IsaacGym environments.
This work addresses the problem of predicting the motion trajectories of dynamic objects in the environment. Recent advances in predicting motion patterns often rely on machine learning techniques to extrapolate motion patterns from observed trajectories, with no mechanism to directly incorporate known rules. We propose a novel framework, which combines probabilistic learning and constrained trajectory optimisation. The learning component of our framework provides a distribution over future motion trajectories conditioned on observed past coordinates. This distribution is then used as a prior to a constrained optimisation problem which enforces chance constraints on the trajectory distribution. This results in constraint-compliant trajectory distributions which closely resemble the prior. In particular, we focus our investigation on collision constraints, such that extrapolated future trajectory distributions conform to the environment structure. We empirically demonstrate on real-world and simulated datasets the ability of our framework to learn complex probabilistic motion trajectories for motion data, while directly enforcing constraints to improve generalisability, producing more robust and higher quality trajectory distributions.
Advances in differentiable numerical integrators have enabled the use of gradient descent techniques to learn ordinary differential equations (ODEs). In the context of machine learning, differentiable solvers are central for Neural ODEs (NODEs), a class of deep learning models with continuous depth, rather than discrete layers. However, these integrators can be unsatisfactorily slow and inaccurate when learning systems of ODEs from long sequences, or when solutions of the system vary at widely different timescales in each dimension. In this paper we propose an alternative approach to learning ODEs from data: we represent the underlying ODE as a vector field that is related to another base vector field by a differentiable bijection, modelled by an invertible neural network. By restricting the base ODE to be amenable to integration, we can drastically speed up and improve the robustness of integration. We demonstrate the efficacy of our method in training and evaluating continuous neural networks models, as well as in learning benchmark ODE systems. We observe improvements of up to two orders of magnitude when integrating learned ODEs with GPUs computation.
Quantification of uncertainty in point cloud matching is critical in many tasks such as pose estimation, sensor fusion, and grasping. Iterative closest point (ICP) is a commonly used pose estimation algorithm which provides a point estimate of the transformation between two point clouds. There are many sources of uncertainty in this process that may arise due to sensor noise, ambiguous environment, and occlusion. However, for safety critical problems such as autonomous driving, a point estimate of the pose transformation is not sufficient as it does not provide information about the multiple solutions. Current probabilistic ICP methods usually do not capture all sources of uncertainty and may provide unreliable transformation estimates which can have a detrimental effect in state estimation or decision making tasks that use this information. In this work we propose a new algorithm to align two point clouds that can precisely estimate the uncertainty of ICP's transformation parameters. We develop a Stein variational inference framework with gradient based optimization of ICP's cost function. The method provides a non-parametric estimate of the transformation, can model complex multi-modal distributions, and can be effectively parallelized on a GPU. Experiments using 3D kinect data as well as sparse indoor/outdoor LiDAR data show that our method is capable of efficiently producing accurate pose uncertainty estimates.
Robotic cutting of soft materials is critical for applications such as food processing, household automation, and surgical manipulation. As in other areas of robotics, simulators can facilitate controller verification, policy learning, and dataset generation. Moreover, differentiable simulators can enable gradient-based optimization, which is invaluable for calibrating simulation parameters and optimizing controllers. In this work, we present DiSECt: the first differentiable simulator for cutting soft materials. The simulator augments the finite element method (FEM) with a continuous contact model based on signed distance fields (SDF), as well as a continuous damage model that inserts springs on opposite sides of the cutting plane and allows them to weaken until zero stiffness, enabling crack formation. Through various experiments, we evaluate the performance of the simulator. We first show that the simulator can be calibrated to match resultant forces and deformation fields from a state-of-the-art commercial solver and real-world cutting datasets, with generality across cutting velocities and object instances. We then show that Bayesian inference can be performed efficiently by leveraging the differentiability of the simulator, estimating posteriors over hundreds of parameters in a fraction of the time of derivative-free methods. Finally, we illustrate that control parameters in the simulation can be optimized to minimize cutting forces via lateral slicing motions. We publish videos and additional results on our project website at https://diff-cutting-sim.github.io.
Sampling-based model predictive control (MPC) is a promising tool for feedback control of robots with complex and non-smooth dynamics and cost functions. The computationally demanding nature of sampling-based MPC algorithms is a key bottleneck in their application to high-dimensional robotic manipulation problems. Previous methods have addressed this issue by running MPC in the task space while relying on a low-level operational space controller for joint control. However, by not using the joint space of the robot in the MPC formulation, existing methods cannot directly account for non-task space related constraints such as avoiding joint limits, singular configurations, and link collisions. In this paper, we develop a joint space sampling-based MPC for manipulators that can be efficiently parallelized using GPUs. Our approach can handle task and joint space constraints while taking less than 0.02 seconds (50Hz) to compute the next control command. Further, our method can integrate perception into the control problem by utilizing learned cost functions from raw sensor data. We validate our approach by deploying it on a Franka Panda robot for a variety of common manipulation tasks. We study the effect of different cost formulations and MPC parameters on the synthesized behavior and provide key insights that pave the way for the application of sampling-based MPC for manipulators in a principled manner. Videos of experiments can be found at: https://sites.google.com/view/manipulation-mppi.