Koray Kavukcuoglu

Asynchronous Methods for Deep Reinforcement Learning

Jun 16, 2016

Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu

Abstract: We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

* ICML 2016
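
The idea of several learners applying gradient updates to shared parameters without waiting for one another can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a single scalar parameter, a hypothetical quadratic loss, and Python threads standing in for parallel actor-learners. All names (`worker`, `train_async`, `TARGET`) are illustrative.

```python
import random
import threading

TARGET = 3.0          # minimiser of the toy loss (theta - TARGET)^2 (assumed)
LEARNING_RATE = 0.05  # arbitrary illustrative step size


def worker(shared, steps, seed):
    """One learner: repeatedly read the shared parameter and update it in place."""
    rng = random.Random(seed)  # each worker has its own noise source
    for _ in range(steps):
        theta = shared[0]  # read may be stale by the time the update lands
        grad = 2.0 * (theta - TARGET) + rng.gauss(0.0, 0.1)  # noisy gradient
        shared[0] -= LEARNING_RATE * grad  # update applied without a lock


def train_async(num_workers=4, steps=2000):
    """Run several workers concurrently on one shared parameter."""
    shared = [0.0]  # single shared parameter, visible to all threads
    threads = [
        threading.Thread(target=worker, args=(shared, steps, seed))
        for seed in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared[0]
```

Calling `train_async()` returns a value close to `TARGET`. The exact result varies from run to run because the interleaving of the workers is not deterministic.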

Strategic Attentive Writer for Learning Macro-Actions

Jun 15, 2016

Alexander Vezhnevets, Volodymyr Mnih, John Agapiou, Simon Osindero, Alex Graves, Oriol Vinyals, Koray Kavukcuoglu

Abstract: We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner by purely interacting with an environment in a reinforcement learning setting. The network builds an internal plan, which is continuously updated upon observation of the next input from the environment. It can also partition this internal representation into contiguous sub-sequences by learning for how long the plan can be committed to, i.e. followed without re-planning. Combining these properties, the proposed model, dubbed STRategic Attentive Writer (STRAW), can learn high-level, temporally abstracted macro-actions of varying lengths that are solely learnt from data without any prior information. These macro-actions enable both structured exploration and economic computation. We experimentally demonstrate that STRAW delivers strong improvements on several ATARI games by employing temporally extended planning strategies (e.g. Ms. Pacman and Frostbite). It is at the same time a general algorithm that can be applied to any sequence data. To that end, we also show that when trained on a text prediction task, STRAW naturally predicts frequent n-grams (instead of macro-actions), demonstrating the generality of the approach.

Exploiting Cyclic Symmetry in Convolutional Neural Networks

May 26, 2016

Sander Dieleman, Jeffrey De Fauw, Koray Kavukcuoglu

Abstract: Many classes of images exhibit rotational symmetry. Convolutional neural networks are sometimes trained using data augmentation to exploit this, but they are still required to learn the rotation equivariance properties from the data. Encoding these properties into the network architecture, as we are already used to doing for translation equivariance by using convolutional layers, could result in a more efficient use of the parameter budget by relieving the model from learning them. We introduce four operations which can be inserted into neural network models as layers, and which can be combined to make these models partially equivariant to rotations. They also enable parameter sharing across different orientations. We evaluate the effect of these architectural modifications on three datasets which exhibit rotational symmetry and demonstrate improved performance with smaller models.

* 10 pages, 6 figures, accepted for publication at ICML 2016
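
One way to see how sharing a filter across orientations can give rotation invariance is to apply the same kernel to all four 90-degree rotations of an input and pool the results. The sketch below is a simplified illustration of that idea, not the paper's layers. It assumes square single-channel inputs stored as nested lists, and the function names are hypothetical.

```python
def rot90(grid):
    """Rotate a square 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def max_response(image, kernel):
    """Largest valid cross-correlation response of kernel over image."""
    kh, kw = len(kernel), len(kernel[0])
    best = None
    for i in range(len(image) - kh + 1):
        for j in range(len(image[0]) - kw + 1):
            r = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            best = r if best is None else max(best, r)
    return best


def cyclic_pooled_response(image, kernel):
    """Apply one shared kernel to all four rotations, then pool with max."""
    responses = []
    view = image
    for _ in range(4):
        responses.append(max_response(view, kernel))
        view = rot90(view)
    return max(responses)
```

Because rotating the input only permutes the set of four views, `cyclic_pooled_response(image, kernel)` and `cyclic_pooled_response(rot90(image), kernel)` return the same value for any kernel.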

Spatial Transformer Networks

Feb 04, 2016

Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu

Abstract: Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter efficient manner. In this work we introduce a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network. This differentiable module can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps, conditional on the feature map itself, without any extra training supervision or modification to the optimisation process. We show that the use of spatial transformers results in models which learn invariance to translation, scale, rotation and more generic warping, resulting in state-of-the-art performance on several benchmarks, and for a number of classes of transformations.
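
A spatial transformation of a feature map can be sketched as two steps: map each output pixel to a source location through a transform, then read the input at that location by interpolation. The example below is a minimal sketch under stated assumptions: a 2x3 affine matrix `theta`, coordinates normalised to [-1, 1], bilinear interpolation, zero padding outside the input, and output dimensions of at least 2. In a full model `theta` would be predicted by a small network; here it is passed in by hand. The function names are illustrative.

```python
import math


def bilinear_sample(image, py, px):
    """Sample image at fractional pixel (py, px); outside pixels count as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(py), math.floor(px)
    dy, dx = py - y0, px - x0
    value = 0.0
    for yy, wy in ((y0, 1.0 - dy), (y0 + 1, dy)):
        for xx, wx in ((x0, 1.0 - dx), (x0 + 1, dx)):
            if 0 <= yy < h and 0 <= xx < w:
                value += wy * wx * image[yy][xx]
    return value


def spatial_transform(image, theta, out_h, out_w):
    """Warp image with the 2x3 affine matrix theta (normalised coordinates)."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(out_h):
        yt = -1.0 + 2.0 * i / (out_h - 1)  # target row in [-1, 1]
        row = []
        for j in range(out_w):
            xt = -1.0 + 2.0 * j / (out_w - 1)  # target column in [-1, 1]
            # Source location for this output pixel.
            xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
            ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
            # Convert normalised source coordinates back to pixel indices.
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            row.append(bilinear_sample(image, py, px))
        out.append(row)
    return out
```

With `theta = [[1, 0, 0], [0, 1, 0]]` the output reproduces the input, and with `theta = [[-1, 0, 0], [0, 1, 0]]` each row is mirrored. Every step is a weighted sum, which is what allows gradients to flow back to both the input and `theta`.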

Policy Distillation

Jan 07, 2016

Andrei A. Rusu, Sergio Gomez Colmenarejo, Caglar Gulcehre, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu, Raia Hadsell

Abstract: Policies for complex visual tasks have been successfully learned with deep reinforcement learning, using an approach called deep Q-networks (DQN), but relatively large (task-specific) networks and extensive training are needed to achieve good performance. In this work, we present a novel method called policy distillation that can be used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient. Furthermore, the same method can be used to consolidate multiple task-specific policies into a single policy. We demonstrate these claims using the Atari domain and show that the multi-task distilled agent outperforms the single-task teachers as well as a jointly-trained DQN agent.

* Submitted to ICLR 2016
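
The abstract does not state which loss is used to train the smaller network. One common choice for distillation, shown here purely as an assumption, is the KL divergence between a temperature-softened softmax over the teacher's Q-values and the student's softmax output. The temperature value and function names below are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def distillation_loss(teacher_q, student_logits, temperature=0.1):
    """KL(teacher || student) with the teacher's Q-values divided by temperature."""
    log_p = log_softmax([q / temperature for q in teacher_q])  # teacher targets
    log_s = log_softmax(student_logits)                        # student output
    return sum(math.exp(lp) * (lp - ls) for lp, ls in zip(log_p, log_s))
```

The loss is zero when the student's logits match the scaled teacher Q-values and positive otherwise. A temperature below 1 sharpens the teacher distribution, so the student is pushed mainly toward the teacher's preferred action.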

Massively Parallel Methods for Deep Reinforcement Learning

Jul 16, 2015

Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen(+4 more)

Abstract: We present the first massively distributed architecture for deep reinforcement learning. This architecture uses four main components: parallel actors that generate new behaviour; parallel learners that are trained from stored experience; a distributed neural network to represent the value function or behaviour policy; and a distributed store of experience. We used our architecture to implement the Deep Q-Network algorithm (DQN). Our distributed algorithm was applied to 49 Atari 2600 games from the Arcade Learning Environment, using identical hyperparameters. Our performance surpassed non-distributed DQN in 41 of the 49 games and also reduced the wall-time required to achieve these results by an order of magnitude on most games.

* Presented at the Deep Learning Workshop, International Conference on Machine Learning, Lille, France, 2015

Natural Neural Networks

Jul 01, 2015

Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu

Abstract: We introduce Natural Neural Networks, a novel family of algorithms that speed up convergence by adapting their internal representation during training to improve conditioning of the Fisher matrix. In particular, we show a specific example that employs a simple and efficient reparametrization of the neural network weights by implicitly whitening the representation obtained at each layer, while preserving the feed-forward computation of the network. Such networks can be trained efficiently via the proposed Projected Natural Gradient Descent algorithm (PRONG), which amortizes the cost of these reparametrizations over many parameter updates and is closely related to the Mirror Descent online learning algorithm. We highlight the benefits of our method on both unsupervised and supervised learning tasks, and showcase its scalability by training on the large-scale ImageNet Challenge dataset.

Weight Uncertainty in Neural Networks

May 21, 2015

Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra

Abstract: We introduce a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop. It regularises the weights by minimising a compression cost, known as the variational free energy or the expected lower bound on the marginal likelihood. We show that this principled kind of regularisation yields comparable performance to dropout on MNIST classification. We then demonstrate how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems, and how this weight uncertainty can be used to drive the exploration-exploitation trade-off in reinforcement learning.

* In Proceedings of the 32nd International Conference on Machine Learning (ICML 2015)
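
The variational free energy named in the abstract can be estimated from a single weight sample as a complexity term, log q(w) minus log p(w), plus the negative log-likelihood of the data. The sketch below assumes a one-weight linear model, a Gaussian posterior whose standard deviation is softplus(rho), a zero-mean Gaussian prior, and Gaussian observation noise. These modelling choices and all names are illustrative assumptions, not details taken from the abstract.

```python
import math
import random


def softplus(x):
    """Map an unconstrained value to a positive standard deviation."""
    return math.log1p(math.exp(x))


def log_normal(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def sampled_free_energy(mu, rho, data, rng, prior_std=1.0, noise_std=1.0):
    """One-sample estimate of the free energy for the model y = w * x + noise."""
    sigma = softplus(rho)
    eps = rng.gauss(0.0, 1.0)
    w = mu + sigma * eps  # weight sample written as a function of (mu, rho)
    # Complexity: how far the posterior sample is from the prior.
    complexity = log_normal(w, mu, sigma) - log_normal(w, 0.0, prior_std)
    # Data fit: negative log-likelihood of the observations under w.
    neg_log_lik = -sum(log_normal(y, w * x, noise_std) for x, y in data)
    return complexity + neg_log_lik, complexity, neg_log_lik
```

Writing the sample as `mu + sigma * eps` keeps it a deterministic function of `mu` and `rho` for a fixed `eps`, so an automatic differentiation library could backpropagate through it. When the posterior equals the prior, the complexity term vanishes for every sample.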

Multiple Object Recognition with Visual Attention

Apr 23, 2015

Jimmy Ba, Volodymyr Mnih, Koray Kavukcuoglu

Abstract: We present an attention-based model for recognizing multiple objects in images. The proposed model is a deep recurrent neural network trained with reinforcement learning to attend to the most relevant regions of the input image. We show that the model learns to both localize and recognize multiple objects despite being given only class labels during training. We evaluate the model on the challenging task of transcribing house number sequences from Google Street View images and show that it is both more accurate than the state-of-the-art convolutional networks and uses fewer parameters and less computation.

Recurrent Models of Visual Attention

Jun 24, 2014

Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu

Abstract: Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built-in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn task-specific policies. We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so.
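
The claim that computation can be decoupled from image size follows from processing only a fixed-size window around a chosen location. The sketch below shows a single-scale crop with zero padding. It is a simplified, hypothetical stand-in for the model's sensor, with the location supplied by hand rather than chosen by a learned policy.

```python
def extract_glimpse(image, center_row, center_col, size):
    """Return a size x size patch centred near (center_row, center_col).

    Pixels that fall outside the image are filled with 0.
    """
    half = size // 2
    h, w = len(image), len(image[0])
    patch = []
    for r in range(center_row - half, center_row - half + size):
        row = []
        for c in range(center_col - half, center_col - half + size):
            inside = 0 <= r < h and 0 <= c < w
            row.append(image[r][c] if inside else 0)
        patch.append(row)
    return patch
```

The work done per glimpse depends only on `size`, so a 3x3 glimpse costs the same on a 5x5 image as on a 500x500 one. The number of glimpses taken then sets the total cost.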
