Stefan Schaal

AMD, MPI for Intelligent Systems, Tübingen, Germany, CLMC Lab, University of Southern California, Los Angeles, USA

Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Mar 03, 2017

Alonso Marco, Felix Berkenkamp, Philipp Hennig, Angela P. Schoellig, Andreas Krause, Stefan Schaal, Sebastian Trimpe

Abstract:In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.

* 7 pages, 6 figures, to appear in IEEE 2017 International Conference on Robotics and Automation (ICRA)

Balancing and Walking Using Full Dynamics LQR Control With Contact Constraints

Jan 27, 2017

Sean Mason, Nicholas Rotella, Stefan Schaal, Ludovic Righetti

Abstract:Torque control algorithms which consider robot dynamics and contact constraints are important for creating dynamic behaviors for humanoids. As computational power increases, algorithms tend to also increase in complexity. However, it is not clear how much complexity is really required to create controllers which exhibit good performance. In this paper, we study the capabilities of a simple approach based on contact consistent LQR controllers designed around key poses to control various tasks on a humanoid robot. We present extensive experimental results on a hydraulic, torque controlled humanoid performing balancing and stepping tasks. This feedback control approach captures the necessary synergies between the DoFs of the robot to guarantee good control performance. We show that for the considered tasks, it is only necessary to re-linearize the dynamics of the robot at different contact configurations and that increasing the number of LQR controllers along desired trajectories does not improve performance. Our results suggest that very simple controllers can yield good performance competitive with current state of the art, but more complex, optimization-based whole-body controllers. A video of the experiments can be found at https://youtu.be/5T08CNKV1hw.
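
To illustrate the kind of LQR design the abstract above refers to, here is a minimal sketch for a scalar discrete-time system. It is not the paper's contact-consistent, full-dynamics controller; the function name `scalar_lqr_gain`, the system values, and the cost weights are all assumptions chosen only to show the Riccati recursion.

```python
def scalar_lqr_gain(a, b, q, r, iterations=200):
    """Iterate the discrete-time Riccati recursion for x[k+1] = a*x[k] + b*u[k].

    The cost is the sum over time of q*x^2 + r*u^2. Returns (k, p), where
    u = -k * x is the feedback law and p is the cost-to-go weight.
    """
    p = q  # start from the terminal cost weight and iterate backwards
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p


# Hypothetical example values (not from the paper): a marginally stable system.
gain, cost_weight = scalar_lqr_gain(a=1.0, b=1.0, q=1.0, r=1.0)
closed_loop_pole = 1.0 - 1.0 * gain  # a - b*k must lie inside the unit circle
print(gain, cost_weight, closed_loop_pole)
```

In the setting described in the abstract, one such gain would be computed per linearization of the dynamics, with matrices in place of the scalars used here.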

Structured contact force optimization for kino-dynamic motion generation

Dec 24, 2016

Alexander Herzog, Stefan Schaal, Ludovic Righetti

Abstract:Optimal control approaches in combination with trajectory optimization have recently proven to be a promising control strategy for legged robots. Computationally efficient and robust algorithms were derived using simplified models of the contact interaction between robot and environment such as the linear inverted pendulum model (LIPM). However, as humanoid robots enter more complex environments, less restrictive models become increasingly important. As we leave the regime of linear models, we need to build dedicated solvers that can compute interaction forces together with consistent kinematic plans for the whole-body. In this paper, we address the problem of planning robot motion and interaction forces for legged robots given predefined contact surfaces. The motion generation process is decomposed into two alternating parts computing force and motion plans in coherence. We focus on the properties of the momentum computation leading to sparse optimal control formulations to be exploited by a dedicated solver. In our experiments, we demonstrate that our motion generation algorithm computes consistent contact forces and joint trajectories for our humanoid robot. We also demonstrate the favorable time complexity due to our formulation and composition of the momentum equations.

* 8 pages

A Probabilistic Representation for Dynamic Movement Primitives

Dec 18, 2016

Franziska Meier, Stefan Schaal

Abstract:Dynamic Movement Primitives have successfully been used to realize imitation learning, trial-and-error learning, reinforcement learning, movement recognition and segmentation and control. Because of this they have become a popular representation for motor primitives. In this work, we showcase how DMPs can be reformulated as a probabilistic linear dynamical system with control inputs. Through this probabilistic representation of DMPs, algorithms such as Kalman filtering and smoothing are directly applicable to perform inference on proprioceptive sensor measurements during execution. We show that inference in this probabilistic model automatically leads to a feedback term to online modulate the execution of a DMP. Furthermore, we show how inference allows us to measure the likelihood that we are successfully executing a given motion primitive. In this context, we show initial results of using the probabilistic model to detect execution failures on a simulated movement primitive dataset.

Probabilistic Articulated Real-Time Tracking for Robot Manipulation

Nov 25, 2016

Cristina Garcia Cifuentes, Jan Issac, Manuel Wüthrich, Stefan Schaal, Jeannette Bohg

Abstract:We propose a probabilistic filtering method which fuses joint measurements with depth images to yield a precise, real-time estimate of the end-effector pose in the camera frame. This avoids the need for frame transformations when using it in combination with visual object tracking methods. Precision is achieved by modeling and correcting biases in the joint measurements as well as inaccuracies in the robot model, such as poor extrinsic camera calibration. We make our method computationally efficient through a principled combination of Kalman filtering of the joint measurements and asynchronous depth-image updates based on the Coordinate Particle Filter. We quantitatively evaluate our approach on a dataset recorded from a real robotic platform, annotated with ground truth from a motion capture system. We show that our approach is robust and accurate even under challenging conditions such as fast motion, significant and long-term occlusions, and time-varying biases. We release the dataset along with open-source code of our approach to allow for quantitative comparison with alternative approaches.

* 8 pages, 7 figures. Revision submitted to IEEE Robotics and Automation Letters (RA-L). Fixed wrong order of bars in boxplots; further argumentation

Stepping Stabilization Using a Combination of DCM Tracking and Step Adjustment

Sep 30, 2016

Majid Khadiv, Sebastien Kleff, Alexander Herzog, S. Ali. A. Moosavian, Stefan Schaal, Ludovic Righetti

Abstract:In this paper, a method for stabilizing biped robots stepping by a combination of Divergent Component of Motion (DCM) tracking and step adjustment is proposed. In this method, the DCM trajectory is generated, consistent with the predefined footprints. Furthermore, a swing foot trajectory modification strategy is proposed to adapt the landing point, using DCM measurement. In order to apply the generated trajectories to the full robot, a Hierarchical Inverse Dynamics (HID) is employed. The HID enables us to use different combinations of the DCM tracking and step adjustment for stabilizing different biped robots. Simulation experiments on two scenarios for two different simulated robots, one with active ankles and the other with passive ankles, are carried out. Simulation results demonstrate the effectiveness of the proposed method for robots with both active and passive ankles.

* 6 pages, 5 figures
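
As background for the Divergent Component of Motion named in the abstract above, here is a minimal sketch under the standard linear inverted pendulum assumption (constant center-of-mass height, one horizontal axis). It is not the paper's tracking or step-adjustment method; the function names and all numbers are assumptions for illustration.

```python
import math

GRAVITY = 9.81  # m/s^2


def natural_frequency(com_height):
    """Natural frequency of a linear inverted pendulum with constant CoM height."""
    return math.sqrt(GRAVITY / com_height)


def dcm(com_pos, com_vel, com_height):
    """Divergent Component of Motion: xi = x + x_dot / omega."""
    return com_pos + com_vel / natural_frequency(com_height)


def dcm_after(xi0, cop, com_height, t):
    """DCM after time t for a constant center of pressure (CoP).

    From xi_dot = omega * (xi - cop), the DCM moves exponentially away
    from the CoP unless it starts exactly on it.
    """
    omega = natural_frequency(com_height)
    return cop + (xi0 - cop) * math.exp(omega * t)


# Hypothetical state: CoM 2 cm ahead of the foot, moving forward at 0.3 m/s.
xi = dcm(com_pos=0.02, com_vel=0.3, com_height=0.8)
print(xi, dcm_after(xi, cop=0.0, com_height=0.8, t=0.2))
```

Because the DCM diverges from the CoP, a step location placed at or beyond the current DCM is what allows the motion to be stopped, which is the intuition behind adjusting the landing point from a DCM measurement.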

DOOMED: Direct Online Optimization of Modeling Errors in Dynamics

Aug 09, 2016

Nathan Ratliff, Franziska Meier, Daniel Kappler, Stefan Schaal

Abstract:It has long been hoped that model-based control will improve tracking performance while maintaining or increasing compliance. This hope hinges on having or being able to estimate an accurate inverse dynamics model. As a result, substantial effort has gone into modeling and estimating dynamics (error) models. Most recent research has focused on learning the true inverse dynamics using data points mapping observed accelerations to the torques used to generate them. Unfortunately, if the initial tracking error is bad, such learning processes may train substantially off-distribution to predict well on actual observed acceleration rather than the desired accelerations. This work takes a different approach. We define a class of gradient-based online learning algorithms we term Direct Online Optimization for Modeling Errors in Dynamics (DOOMED) that directly minimize an objective measuring the divergence between actual and desired accelerations. Our objective is defined in terms of the true system's unknown dynamics and is therefore impossible to evaluate. However, we show that its gradient is measurable online from system data. We develop a novel adaptive control approach based on running online learning to directly correct (inverse) dynamics errors in real time using the data stream from the robot to accurately achieve desired accelerations during execution.

* Added an acknowledgements section

A Convex Model of Momentum Dynamics for Multi-Contact Motion Generation

Jul 28, 2016

Brahayam Ponton, Alexander Herzog, Stefan Schaal, Ludovic Righetti

Abstract:Linear models for control and motion generation of humanoid robots have received significant attention in the past years, not only due to their well known theoretical guarantees, but also because of practical computational advantages. However, to tackle more challenging tasks and scenarios such as locomotion on uneven terrain, a more expressive model is required. In this paper, we are interested in contact interaction-centered motion optimization based on the momentum dynamics model. This model is non-linear and non-convex; however, we find a relaxation of the problem that allows us to formulate it as a single convex quadratically-constrained quadratic program (QCQP) that can be very efficiently optimized. Furthermore, experimental results suggest that this relaxation is tight and therefore useful for multi-contact planning. This convex model is then coupled to the optimization of end-effector contact locations using a mixed integer program, which can be solved in realtime. This becomes relevant e.g. to recover from external pushes, where a predefined stepping plan is likely to fail and an online adaptation of the contact location is needed. The performance of our algorithm is demonstrated in several multi-contact scenarios for a humanoid robot.

* 8 pages

On the Fundamental Importance of Gauss-Newton in Motion Optimization

May 30, 2016

Nathan Ratliff, Marc Toussaint, Jeannette Bohg, Stefan Schaal

Abstract:Hessian information speeds convergence substantially in motion optimization. The better the Hessian approximation the better the convergence. But how good is a given approximation theoretically? How much are we losing? This paper addresses that question and proves that for a particularly popular and empirically strong approximation known as the Gauss-Newton approximation, we actually lose very little--for a large class of highly expressive objective terms, the true Hessian actually limits to the Gauss-Newton Hessian quickly as the trajectory's time discretization becomes small. This result both motivates its use and offers insight into computationally efficient design. For instance, traditional representations of kinetic energy exploit the generalized inertia matrix whose derivatives are usually difficult to compute. We introduce here a novel reformulation of rigid body kinetic energy designed explicitly for fast and accurate curvature calculation. Our theorem proves that the Gauss-Newton Hessian under this formulation efficiently captures the kinetic energy curvature, but requires only as much computation as a single evaluation of the traditional representation. Additionally, we introduce a technique that exploits these ideas implicitly using Cholesky decompositions for some cases when similar objective terms reformulations exist but may be difficult to find. Our experiments validate these findings and demonstrate their use on a real-world motion optimization system for high-dof motion generation.
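
For readers unfamiliar with the Gauss-Newton approximation discussed in the abstract above, here is a minimal sketch on a toy least-squares objective f(x) = 0.5 * sum_i r_i(x)^2. The exact Hessian is J^T J + sum_i r_i * Hess(r_i); Gauss-Newton keeps only J^T J. The residual functions are assumptions made up for illustration, and the sketch does not reproduce the paper's time-discretization result.

```python
def gauss_newton_hessian(jacobian):
    """Return J^T J for a Jacobian given as a list of rows."""
    n = len(jacobian[0])
    return [[sum(row[i] * row[j] for row in jacobian) for j in range(n)]
            for i in range(n)]


def residuals(x):
    """Toy residuals (hypothetical): r_0 = x0^2 - x1, r_1 = x0 - 1."""
    return [x[0] ** 2 - x[1], x[0] - 1.0]


def jacobian(x):
    """Jacobian of the toy residuals with respect to (x0, x1)."""
    return [[2.0 * x[0], -1.0], [1.0, 0.0]]


def full_hessian(x):
    """Exact Hessian: J^T J plus the residual-weighted curvature terms."""
    h = gauss_newton_hessian(jacobian(x))
    r = residuals(x)
    # Only r_0 is curved: Hess(r_0) = [[2, 0], [0, 0]], Hess(r_1) = 0.
    h[0][0] += 2.0 * r[0]
    return h


at_solution = [1.0, 1.0]    # both residuals vanish here
away = [1.0, 0.0]           # r_0 = 1, so the dropped term is nonzero
print(gauss_newton_hessian(jacobian(at_solution)), full_hessian(at_solution))
print(gauss_newton_hessian(jacobian(away)), full_hessian(away))
```

The two matrices agree where the residuals vanish and differ only in the residual-weighted curvature term elsewhere, which is the term whose size the abstract's result is concerned with.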

Robust Gaussian Filtering using a Pseudo Measurement

May 30, 2016

Manuel Wüthrich, Cristina Garcia Cifuentes, Sebastian Trimpe, Franziska Meier, Jeannette Bohg, Jan Issac, Stefan Schaal

Abstract:Many sensors, such as range, sonar, radar, GPS and visual devices, produce measurements which are contaminated by outliers. This problem can be addressed by using fat-tailed sensor models, which account for the possibility of outliers. Unfortunately, all estimation algorithms belonging to the family of Gaussian filters (such as the widely-used extended Kalman filter and unscented Kalman filter) are inherently incompatible with such fat-tailed sensor models. The contribution of this paper is to show that any Gaussian filter can be made compatible with fat-tailed sensor models by applying one simple change: Instead of filtering with the physical measurement, we propose to filter with a pseudo measurement obtained by applying a feature function to the physical measurement. We derive such a feature function which is optimal under some conditions. Simulation results show that the proposed method can effectively handle measurement outliers and allows for robust filtering in both linear and nonlinear systems.
