Alexey Dosovitskiy

Learning agile and dynamic motor skills for legged robots

Jan 24, 2019
Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, Marco Hutter

Legged robots pose one of the greatest challenges in robotics. Dynamic and agile maneuvers of animals cannot be imitated by existing methods that are crafted by humans. A compelling alternative is reinforcement learning, which requires minimal craftsmanship and promotes the natural evolution of a control policy. However, so far, reinforcement learning research for legged robots has been mainly limited to simulation, and only a few comparatively simple examples have been deployed on real systems. The primary reason is that training with real robots, particularly with dynamically balancing systems, is complicated and expensive. In the present work, we introduce a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes. The approach is applied to the ANYmal robot, a sophisticated medium-dog-sized quadrupedal system. Using policies trained in simulation, the quadrupedal machine achieves locomotion skills that go beyond what had been achieved with prior methods: ANYmal is capable of precisely and energy-efficiently following high-level body velocity commands, running faster than before, and recovering from falling even in complex configurations.

* Science Robotics 4.26 (2019): eaau5872

Motion Perception in Reinforcement Learning with Dynamic Objects

Jan 10, 2019
Artemij Amiranashvili, Alexey Dosovitskiy, Vladlen Koltun, Thomas Brox

In dynamic environments, learned controllers are supposed to take motion into account when selecting the action to be taken. However, in existing reinforcement learning works motion is rarely treated explicitly; it is rather assumed that the controller learns the necessary motion representation from temporal stacks of frames implicitly. In this paper, we show that for continuous control tasks learning an explicit representation of motion improves the quality of the learned controller in dynamic scenarios. We demonstrate this on common benchmark tasks (Walker, Swimmer, Hopper), on target reaching and ball catching tasks with simulated robotic arms, and on a dynamic single ball juggling task. Moreover, we find that when equipped with an appropriate network architecture, the agent can, on some tasks, learn motion features also with pure reinforcement learning, without additional supervision. Further we find that using an image difference between the current and the previous frame as an additional input leads to better results than a temporal stack of frames.
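
The two input encodings compared in the last sentence of the abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the nested-list image format, and the tiny example frames are assumptions made purely for illustration.

```python
from typing import List

# A grayscale frame as rows of pixel intensities (hypothetical format).
Frame = List[List[float]]


def frame_difference(current: Frame, previous: Frame) -> Frame:
    """Pixel-wise difference current - previous; nonzero only where pixels changed."""
    return [
        [c - p for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(current, previous)
    ]


def stacked_input(frames: List[Frame]) -> List[Frame]:
    """Baseline encoding: a temporal stack, one channel per raw frame."""
    return list(frames)


def difference_input(current: Frame, previous: Frame) -> List[Frame]:
    """Variant encoding: the current frame plus the difference image as an extra channel."""
    return [current, frame_difference(current, previous)]


# A bright pixel moves one column to the right between two frames.
previous = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
current = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]

obs = difference_input(current, previous)
```

In the difference channel, the old position of the moving pixel shows up as a negative value and the new position as a positive one, so the motion is visible without the network having to compare two raw frames itself.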

Driving Policy Transfer via Modularity and Abstraction

Dec 13, 2018
Matthias Müller, Alexey Dosovitskiy, Bernard Ghanem, Vladlen Koltun

End-to-end approaches to autonomous driving have high sample complexity and are difficult to scale to realistic urban driving. Simulation can help end-to-end driving systems by providing a cheap, safe, and diverse training environment. Yet training driving policies in simulation brings up the problem of transferring such policies to the real world. We present an approach to transferring driving policies from simulation to reality via modularity and abstraction. Our approach is inspired by classic driving systems and aims to combine the benefits of modular architectures and end-to-end deep learning approaches. The key idea is to encapsulate the driving policy such that it is not directly exposed to raw perceptual input or low-level vehicle dynamics. We evaluate the presented approach in simulated urban environments and in the real world. In particular, we transfer a driving policy trained in simulation to a 1/5-scale robotic truck that is deployed in a variety of conditions, with no finetuning, on two continents. The supplementary video can be viewed at https://youtu.be/BrMDJqI6H5U

* Accepted at the Conference on Robot Learning (CoRL'18) http://proceedings.mlr.press/v87/mueller18a.html

Unsupervised Learning of Shape and Pose with Differentiable Point Clouds

Oct 22, 2018
Eldar Insafutdinov, Alexey Dosovitskiy

We address the problem of learning accurate 3D shape and camera pose from a collection of unlabeled category-specific images. We train a convolutional network to predict both the shape and the pose from a single image by minimizing the reprojection error: given several views of an object, the projections of the predicted shapes to the predicted camera poses should match the provided views. To deal with pose ambiguity, we introduce an ensemble of pose predictors which we then distill to a single "student" model. To allow for efficient learning of high-fidelity shapes, we represent the shapes by point clouds and devise a formulation allowing for differentiable projection of these. Our experiments show that the distilled ensemble of pose predictors learns to estimate the pose accurately, while the point cloud representation allows prediction of detailed shape models. The supplementary video can be found at https://www.youtube.com/watch?v=LuIGovKeo60
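
To give a rough intuition for why a point cloud projection can be made differentiable, the sketch below splats each point onto the image plane as a Gaussian, so that pixel values and the reprojection error vary smoothly with the point coordinates. It is a simplified stand-in, not the formulation from the paper: the orthographic camera, the 2D Gaussian kernel, the `sigma` parameter, and all function names are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
Image = List[List[float]]


def project_points(points: List[Point], size: int, sigma: float = 0.5) -> Image:
    """Orthographic projection along z; each point adds a 2D Gaussian blob to the image."""
    image = [[0.0] * size for _ in range(size)]
    for x, y, _z in points:
        for row in range(size):
            for col in range(size):
                d2 = (col - x) ** 2 + (row - y) ** 2
                image[row][col] += math.exp(-d2 / (2.0 * sigma ** 2))
    return image


def reprojection_error(projected: Image, target: Image) -> float:
    """Sum of squared pixel differences between a projection and a provided view."""
    return sum(
        (p - t) ** 2
        for row_p, row_t in zip(projected, target)
        for p, t in zip(row_p, row_t)
    )


# Target view produced by a point at x = 2; candidate points approach it from x = 1.
target = project_points([(2.0, 1.0, 0.0)], size=4)
error_far = reprojection_error(project_points([(1.0, 1.0, 0.0)], size=4), target)
error_near = reprojection_error(project_points([(1.5, 1.0, 0.0)], size=4), target)
```

Because every pixel is a smooth function of the point positions, the error shrinks gradually as the point approaches its correct location, which is the property a gradient-based learner needs. A hard rasterization that switches single pixels on and off would not provide such a signal.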

Deep Drone Racing: Learning Agile Flight in Dynamic Environments

Oct 09, 2018
Elia Kaufmann, Antonio Loquercio, Rene Ranftl, Alexey Dosovitskiy, Vladlen Koltun, Davide Scaramuzza

Autonomous agile flight brings up fundamental challenges in robotics, such as coping with unreliable state estimation, reacting optimally to dynamically changing environments, and coupling perception and action in real time under severe resource constraints. In this paper, we consider these challenges in the context of autonomous, vision-based drone racing in dynamic environments. Our approach combines a convolutional neural network (CNN) with a state-of-the-art path-planning and control system. The CNN directly maps raw images into a robust representation in the form of a waypoint and desired speed. This information is then used by the planner to generate a short, minimum-jerk trajectory segment and corresponding motor commands to reach the desired goal. We demonstrate our method in autonomous agile flight scenarios, in which a vision-based quadrotor traverses drone-racing tracks with possibly moving gates. Our method does not require any explicit map of the environment and runs fully onboard. We extensively test the precision and robustness of the approach in simulation and in the physical world. We also evaluate our method against state-of-the-art navigation approaches and professional human drone pilots.

* Conference on Robot Learning (CoRL), 2018
* Accepted for publication in the Conference on Robot Learning (CoRL) 2018, Zurich. 10 pages (+3 supplementary)
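
The abstract above mentions that the planner generates a short minimum-jerk trajectory segment toward the predicted waypoint. As a minimal one-dimensional sketch, the code below uses the standard rest-to-rest minimum-jerk polynomial. The rest-to-rest boundary conditions (zero velocity and acceleration at both ends) and the function names are assumptions for illustration; the planner in the paper works with more general boundary states.

```python
def _normalized_time(duration: float, t: float) -> float:
    """Clamp t / duration to the interval [0, 1]."""
    return min(max(t / duration, 0.0), 1.0)


def min_jerk_position(p0: float, p1: float, duration: float, t: float) -> float:
    """Position at time t on a rest-to-rest minimum-jerk segment from p0 to p1."""
    tau = _normalized_time(duration, t)
    s = 10.0 * tau ** 3 - 15.0 * tau ** 4 + 6.0 * tau ** 5
    return p0 + (p1 - p0) * s


def min_jerk_velocity(p0: float, p1: float, duration: float, t: float) -> float:
    """Velocity at time t: the time derivative of min_jerk_position."""
    tau = _normalized_time(duration, t)
    ds = (30.0 * tau ** 2 - 60.0 * tau ** 3 + 30.0 * tau ** 4) / duration
    return (p1 - p0) * ds


# Hypothetical segment: move 4 m along one axis in 2 s, sampled at five times.
samples = [min_jerk_position(0.0, 4.0, 2.0, 0.5 * k) for k in range(5)]
```

The quintic blend starts and ends with zero velocity and zero acceleration, and it passes through the midpoint of the segment at half the duration.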

On Offline Evaluation of Vision-based Driving Models

Sep 13, 2018
Felipe Codevilla, Antonio M. López, Vladlen Koltun, Alexey Dosovitskiy

Autonomous driving models should ideally be evaluated by deploying them on a fleet of physical vehicles in the real world. Unfortunately, this approach is not practical for the vast majority of researchers. An attractive alternative is to evaluate models offline, on a pre-collected validation dataset with ground truth annotation. In this paper, we investigate the relation between various online and offline metrics for evaluation of autonomous driving models. We find that offline prediction error is not necessarily correlated with driving quality, and two models with identical prediction error can differ dramatically in their driving performance. We show that the correlation of offline evaluation with driving quality can be significantly improved by selecting an appropriate validation dataset and suitable offline metrics. The supplementary video can be viewed at https://www.youtube.com/watch?v=P8K8Z-iF0cY

* Published at the ECCV 2018 conference

Artistic style transfer for videos and spherical images

Aug 05, 2018
Manuel Ruder, Alexey Dosovitskiy, Thomas Brox

Manually re-drawing an image in a certain artistic style takes a professional artist a long time. Doing this for a video sequence single-handedly is beyond imagination. We present two computational approaches that transfer the style from one image (for example, a painting) to a whole video sequence. In our first approach, we adapt to videos the original image style transfer technique by Gatys et al. based on energy minimization. We introduce new ways of initialization and new loss functions to generate consistent and stable stylized video sequences even in cases with large motion and strong occlusion. Our second approach formulates video stylization as a learning problem. We propose a deep network architecture and training procedures that allow us to stylize arbitrary-length videos in a consistent and stable way, and nearly in real time. We show that the proposed methods clearly outperform simpler baselines both qualitatively and quantitatively. Finally, we propose a way to adapt these approaches also to 360 degree images and videos as they emerge with recent virtual reality hardware.

* v3: added ref to conference. This paper is a successor of and overlaps with arXiv:1604.08610, International Journal of Computer Vision (IJCV), 2018

On Evaluation of Embodied Navigation Agents

Jul 18, 2018
Peter Anderson, Angel Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Vladlen Koltun, Jana Kosecka, Jitendra Malik, Roozbeh Mottaghi, Manolis Savva, Amir R. Zamir

Skillful mobile operation in three-dimensional environments is a primary topic of study in Artificial Intelligence. The past two years have seen a surge of creative work on navigation. This creative output has produced a plethora of sometimes incompatible task definitions and evaluation protocols. To coordinate ongoing and future research in this area, we have convened a working group to study empirical methodology in navigation research. The present document summarizes the consensus recommendations of this working group. We discuss different problem statements and the role of generalization, present evaluation measures, and provide standard scenarios that can be used for benchmarking.

* Report of a working group on empirical methodology in navigation research. Authors are listed in alphabetical order

TD or not TD: Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning

Jun 04, 2018
Artemij Amiranashvili, Alexey Dosovitskiy, Vladlen Koltun, Thomas Brox

Our understanding of reinforcement learning (RL) has been shaped by theoretical and empirical results that were obtained decades ago using tabular representations and linear function approximators. These results suggest that RL methods that use temporal differencing (TD) are superior to direct Monte Carlo estimation (MC). How do these results hold up in deep RL, which deals with perceptually complex environments and deep nonlinear models? In this paper, we re-examine the role of TD in modern deep RL, using specially designed environments that control for specific factors that affect performance, such as reward sparsity, reward delay, and the perceptual complexity of the task. When comparing TD with infinite-horizon MC, we are able to reproduce classic results in modern settings. Yet we also find that finite-horizon MC is not inferior to TD, even when rewards are sparse or delayed. This makes MC a viable alternative to TD in deep RL.
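
The difference between the two kinds of learning targets discussed in the abstract can be sketched as follows. This is a generic textbook-style illustration, not the experimental setup of the paper: the function names, the discount factor, and the toy reward sequence are assumptions.

```python
from typing import List


def td_target(reward: float, next_value: float, gamma: float) -> float:
    """One-step TD target: bootstraps from the current value estimate of the next state."""
    return reward + gamma * next_value


def finite_horizon_mc_returns(rewards: List[float], horizon: int, gamma: float) -> List[float]:
    """Finite-horizon Monte Carlo targets: discounted sum of the next `horizon` observed rewards.

    No value estimate is used; rewards beyond the horizon or past the end
    of the episode are ignored.
    """
    n = len(rewards)
    returns = []
    for t in range(n):
        g = 0.0
        for k in range(horizon):
            if t + k >= n:
                break
            g += (gamma ** k) * rewards[t + k]
        returns.append(g)
    return returns


# Toy episode with a single delayed reward at step 2.
rewards = [0.0, 0.0, 1.0, 0.0]
mc_targets = finite_horizon_mc_returns(rewards, horizon=2, gamma=0.5)
```

With a horizon of 2, only the step directly before the reward receives a nonzero target in addition to the rewarded step itself. A TD target would instead propagate the reward backwards through the learned value estimates over many updates.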
