Vladlen Koltun

Stanford University

Assessing Generalization in Deep Reinforcement Learning

Oct 29, 2018

Charles Packer, Katelyn Gao, Jernej Kos, Philipp Krähenbühl, Vladlen Koltun, Dawn Song

Abstract: Deep reinforcement learning (RL) has achieved breakthrough results on many tasks, but has been shown to be sensitive to system changes at test time. As a result, building deep RL agents that generalize has become an active research area. Our aim is to catalyze and streamline community-wide progress on this problem by providing the first benchmark and a common experimental protocol for investigating generalization in RL. Our benchmark contains a diverse set of environments and our evaluation methodology covers both in-distribution and out-of-distribution generalization. To provide a set of baselines for future research, we conduct a systematic evaluation of deep RL algorithms, including those that specifically tackle the problem of generalization.

* 18 pages, 6 figures

Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search

Oct 25, 2018

Zhuwen Li, Qifeng Chen, Vladlen Koltun

Abstract: We present a learning-based approach to computing solutions for certain NP-hard problems. Our approach combines deep learning techniques with useful algorithmic elements from classic heuristics. The central component is a graph convolutional network that is trained to estimate the likelihood, for each vertex in a graph, of whether this vertex is part of the optimal solution. The network is designed and trained to synthesize a diverse set of solutions, which enables rapid exploration of the solution space via tree search. The presented approach is evaluated on four canonical NP-hard problems and five datasets, which include benchmark satisfiability problems and real social network graphs with up to a hundred thousand nodes. Experimental results demonstrate that the presented approach substantially outperforms recent deep learning work, and performs on par with highly optimized state-of-the-art heuristic solvers for some NP-hard problems. Experiments indicate that our approach generalizes across datasets, and scales to graphs that are orders of magnitude larger than those used during training.

* To appear in NIPS 2018

Trellis Networks for Sequence Modeling

Oct 15, 2018

Shaojie Bai, J. Zico Kolter, Vladlen Koltun

Abstract: We present trellis networks, a new architecture for sequence modeling. On the one hand, a trellis network is a temporal convolutional network with special structure, characterized by weight tying across depth and direct injection of the input into deep layers. On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices. Thus trellis networks with general weight matrices generalize truncated recurrent networks. We leverage these connections to design high-performing trellis networks that absorb structural and algorithmic elements from both recurrent and convolutional models. Experiments demonstrate that trellis networks outperform the current state of the art on a variety of challenging benchmarks, including word-level language modeling on Penn Treebank and WikiText-103, character-level language modeling on Penn Treebank, and stress tests designed to evaluate long-term memory retention. The code is available at https://github.com/locuslab/trellisnet.
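
The two structural properties named in the abstract, weight tying across depth and injection of the input into every layer, can be illustrated with a small sketch. This is not the authors' implementation (that lives in the linked repository): the function name `trellis_forward`, the kernel width of 2, the zero padding, and the plain tanh nonlinearity are all simplifying assumptions chosen for brevity.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def trellis_forward(inputs, w_prev, w_curr, hidden_size, depth):
    """Hypothetical minimal trellis-style forward pass.

    inputs: list of T input vectors (each a list of floats).
    w_prev, w_curr: weight matrices of shape
        hidden_size x (input_size + hidden_size), reused at every layer.
    Returns the list of T hidden vectors after `depth` layers.
    """
    seq_len = len(inputs)
    input_size = len(inputs[0])
    zero_pad = [0.0] * (input_size + hidden_size)
    hidden = [[0.0] * hidden_size for _ in range(seq_len)]

    for _ in range(depth):  # the same two matrices at every layer: weight tying
        new_hidden = []
        for t in range(seq_len):
            # Input injection: the raw input is concatenated with the
            # hidden state at every layer, not only at the first one.
            curr = inputs[t] + hidden[t]
            prev = inputs[t - 1] + hidden[t - 1] if t > 0 else zero_pad
            pre_activation = [
                a + b
                for a, b in zip(matvec(w_prev, prev), matvec(w_curr, curr))
            ]
            new_hidden.append([math.tanh(v) for v in pre_activation])
        hidden = new_hidden
    return hidden
```

Because each layer only looks at positions t-1 and t, the output at step t depends on at most the last `depth + 1` inputs, which mirrors the causal, temporal-convolution view described in the abstract.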

Multi-Task Learning as Multi-Objective Optimization

Oct 10, 2018

Ozan Sener, Vladlen Koltun

Abstract: In multi-task learning, multiple tasks are solved jointly, sharing inductive bias between them. Multi-task learning is inherently a multi-objective problem because different tasks may conflict, necessitating a trade-off. A common compromise is to optimize a proxy objective that minimizes a weighted linear combination of per-task losses. However, this workaround is only valid when the tasks do not compete, which is rarely the case. In this paper, we explicitly cast multi-task learning as multi-objective optimization, with the overall objective of finding a Pareto optimal solution. To this end, we use algorithms developed in the gradient-based multi-objective optimization literature. These algorithms are not directly applicable to large-scale learning problems since they scale poorly with the dimensionality of the gradients and the number of tasks. We therefore propose an upper bound for the multi-objective loss and show that it can be optimized efficiently. We further prove that optimizing this upper bound yields a Pareto optimal solution under realistic assumptions. We apply our method to a variety of multi-task deep learning problems including digit classification, scene understanding (joint semantic segmentation, instance segmentation, and depth estimation), and multi-label classification. Our method produces higher-performing models than recent multi-task learning formulations or per-task training.

* To appear in NIPS 2018
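
To make the contrast with a fixed weighted sum concrete, the sketch below shows one standard building block from gradient-based multi-objective optimization: for two tasks, the convex combination of the task gradients with minimum norm has a closed form. The function names are hypothetical and the example is limited to two tasks with plain Python lists; it is an illustration of the general idea, not the paper's full method or its upper bound.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def min_norm_weight(g1, g2):
    """Return gamma in [0, 1] minimizing ||gamma*g1 + (1-gamma)*g2||^2.

    Setting the derivative with respect to gamma to zero gives
    gamma = (g2 - g1) . g2 / ||g1 - g2||^2, clipped to [0, 1].
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        return 0.5  # identical gradients: every weight gives the same vector
    gamma = dot([b - a for a, b in zip(g1, g2)], g2) / denom
    return min(1.0, max(0.0, gamma))


def common_direction(g1, g2):
    """Min-norm convex combination of two task gradients."""
    gamma = min_norm_weight(g1, g2)
    return [gamma * a + (1.0 - gamma) * b for a, b in zip(g1, g2)]


# Two partially conflicting task gradients (made-up illustrative values).
g_task1 = [1.0, 0.0]
g_task2 = [-0.5, 1.0]
direction = common_direction(g_task1, g_task2)
```

Unlike a hand-picked weighting, the combined vector has a non-negative inner product with both task gradients, so a small step against it does not increase either loss to first order.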

Deep Drone Racing: Learning Agile Flight in Dynamic Environments

Oct 09, 2018

Elia Kaufmann, Antonio Loquercio, Rene Ranftl, Alexey Dosovitskiy, Vladlen Koltun, Davide Scaramuzza

Abstract: Autonomous agile flight brings up fundamental challenges in robotics, such as coping with unreliable state estimation, reacting optimally to dynamically changing environments, and coupling perception and action in real time under severe resource constraints. In this paper, we consider these challenges in the context of autonomous, vision-based drone racing in dynamic environments. Our approach combines a convolutional neural network (CNN) with a state-of-the-art path-planning and control system. The CNN directly maps raw images into a robust representation in the form of a waypoint and desired speed. This information is then used by the planner to generate a short, minimum-jerk trajectory segment and corresponding motor commands to reach the desired goal. We demonstrate our method in autonomous agile flight scenarios, in which a vision-based quadrotor traverses drone-racing tracks with possibly moving gates. Our method does not require any explicit map of the environment and runs fully onboard. We extensively test the precision and robustness of the approach in simulation and in the physical world. We also evaluate our method against state-of-the-art navigation approaches and professional human drone pilots.

* Conference on Robotic Learning (CoRL), 2018
* Accepted for publication in the Conference on Robotic Learning (CoRL) 2018, Zurich. 10 pages (+3 supplementary)
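
As a rough illustration of what a minimum-jerk segment looks like, the sketch below uses the textbook rest-to-rest case along a single axis, where the minimum-jerk solution is a fifth-order polynomial in normalized time. The function names and the assumption of zero velocity and acceleration at both endpoints are simplifications made here; the planner described in the abstract starts from the vehicle's current state and is not reproduced.

```python
def min_jerk_position(p0, pf, t, duration):
    """Position at time t on a rest-to-rest minimum-jerk segment.

    Assumes zero velocity and acceleration at both ends (a simplification).
    """
    tau = min(max(t / duration, 0.0), 1.0)  # normalized time in [0, 1]
    blend = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return p0 + (pf - p0) * blend


def min_jerk_velocity(p0, pf, t, duration):
    """Time derivative of min_jerk_position under the same assumptions."""
    tau = min(max(t / duration, 0.0), 1.0)
    blend_rate = 30 * tau**2 - 60 * tau**3 + 30 * tau**4
    return (pf - p0) * blend_rate / duration


# Sample a 2-second segment from 0 m to 4 m along one axis at 5 points.
samples = [min_jerk_position(0.0, 4.0, 0.5 * k, 2.0) for k in range(5)]
```

A waypoint in three dimensions can be handled by applying the same polynomial to each axis independently, with the segment duration chosen from the desired speed.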

On Offline Evaluation of Vision-based Driving Models

Sep 13, 2018

Felipe Codevilla, Antonio M. López, Vladlen Koltun, Alexey Dosovitskiy

Abstract: Autonomous driving models should ideally be evaluated by deploying them on a fleet of physical vehicles in the real world. Unfortunately, this approach is not practical for the vast majority of researchers. An attractive alternative is to evaluate models offline, on a pre-collected validation dataset with ground truth annotation. In this paper, we investigate the relation between various online and offline metrics for evaluation of autonomous driving models. We find that offline prediction error is not necessarily correlated with driving quality, and two models with identical prediction error can differ dramatically in their driving performance. We show that the correlation of offline evaluation with driving quality can be significantly improved by selecting an appropriate validation dataset and suitable offline metrics. The supplementary video can be viewed at https://www.youtube.com/watch?v=P8K8Z-iF0cY

* Published at the ECCV 2018 conference

On Evaluation of Embodied Navigation Agents

Jul 18, 2018

Peter Anderson, Angel Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Vladlen Koltun, Jana Kosecka, Jitendra Malik, Roozbeh Mottaghi, Manolis Savva (+1 more)

Abstract: Skillful mobile operation in three-dimensional environments is a primary topic of study in Artificial Intelligence. The past two years have seen a surge of creative work on navigation. This creative output has produced a plethora of sometimes incompatible task definitions and evaluation protocols. To coordinate ongoing and future research in this area, we have convened a working group to study empirical methodology in navigation research. The present document summarizes the consensus recommendations of this working group. We discuss different problem statements and the role of generalization, present evaluation measures, and provide standard scenarios that can be used for benchmarking.

* Report of a working group on empirical methodology in navigation research. Authors are listed in alphabetical order

Tangent Convolutions for Dense Prediction in 3D

Jul 06, 2018

Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou

Abstract: We present an approach to semantic scene analysis using deep convolutional networks. Our approach is based on tangent convolutions - a new construction for convolutional networks on 3D data. In contrast to volumetric approaches, our method operates directly on surface geometry. Crucially, the construction is applicable to unstructured point clouds and other noisy real-world data. We show that tangent convolutions can be evaluated efficiently on large-scale point clouds with millions of points. Using tangent convolutions, we design a deep fully-convolutional network for semantic segmentation of 3D point clouds, and apply it to challenging real-world datasets of indoor and outdoor 3D environments. Experimental results show that the presented approach outperforms other recent deep network constructions in detailed analysis of large 3D scenes.

TD or not TD: Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning

Jun 04, 2018

Artemij Amiranashvili, Alexey Dosovitskiy, Vladlen Koltun, Thomas Brox

Abstract: Our understanding of reinforcement learning (RL) has been shaped by theoretical and empirical results that were obtained decades ago using tabular representations and linear function approximators. These results suggest that RL methods that use temporal differencing (TD) are superior to direct Monte Carlo estimation (MC). How do these results hold up in deep RL, which deals with perceptually complex environments and deep nonlinear models? In this paper, we re-examine the role of TD in modern deep RL, using specially designed environments that control for specific factors that affect performance, such as reward sparsity, reward delay, and the perceptual complexity of the task. When comparing TD with infinite-horizon MC, we are able to reproduce classic results in modern settings. Yet we also find that finite-horizon MC is not inferior to TD, even when rewards are sparse or delayed. This makes MC a viable alternative to TD in deep RL.
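
The three kinds of learning targets contrasted in the abstract can be written down in a few lines. The sketch below is a generic illustration with hypothetical function names: full-episode Monte Carlo returns, Monte Carlo returns truncated to a fixed horizon without bootstrapping, and one-step TD targets that bootstrap from a value estimate. The exact target definitions used in the paper's experiments may differ.

```python
def mc_returns(rewards, gamma):
    """Discounted Monte Carlo return for every step of one finished episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def finite_horizon_mc_returns(rewards, gamma, horizon):
    """Discounted reward sum over the next `horizon` steps only, no bootstrap."""
    n = len(rewards)
    return [
        sum(gamma**k * rewards[t + k] for k in range(min(horizon, n - t)))
        for t in range(n)
    ]


def td0_targets(rewards, next_values, gamma):
    """One-step TD targets r_t + gamma * V(s_{t+1}).

    next_values[t] is the current estimate of V(s_{t+1}); use 0.0 after
    a terminal transition.
    """
    return [r + gamma * v for r, v in zip(rewards, next_values)]


# A sparse, delayed reward: only the last of three steps pays off.
episode_rewards = [0.0, 0.0, 1.0]
full_mc = mc_returns(episode_rewards, gamma=0.5)
short_mc = finite_horizon_mc_returns(episode_rewards, gamma=0.5, horizon=2)
```

With the sparse reward above, the horizon-2 target at the first step carries no signal while the full return does, which is the kind of reward-delay effect the abstract says the environments were designed to probe.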

Learning to See in the Dark

May 04, 2018

Chen Chen, Qifeng Chen, Jia Xu, Vladlen Koltun

Abstract: Imaging in low light is challenging due to low photon count and low SNR. Short-exposure images suffer from noise, while long exposure can induce blur and is often impractical. A variety of denoising, deblurring, and enhancement techniques have been proposed, but their effectiveness is limited in extreme conditions, such as video-rate imaging at night. To support the development of learning-based pipelines for low-light image processing, we introduce a dataset of raw short-exposure low-light images, with corresponding long-exposure reference images. Using the presented dataset, we develop a pipeline for processing low-light images, based on end-to-end training of a fully-convolutional network. The network operates directly on raw sensor data and replaces much of the traditional image processing pipeline, which tends to perform poorly on such data. We report promising results on the new dataset, analyze factors that affect performance, and highlight opportunities for future work. The results are shown in the supplementary video at https://youtu.be/qWKUFK7MWvg

* Published at the Conference on Computer Vision and Pattern Recognition (CVPR 2018)
