Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Joel Veness

Gated Linear Networks

Sep 30, 2019
Joel Veness, Tor Lattimore, Avishkar Bhoopchand, David Budden, Christopher Mattern, Agnieszka Grabska-Barwinska, Peter Toth, Simon Schmitt, Marcus Hutter

This paper presents a family of backpropagation-free neural architectures, Gated Linear Networks (GLNs),that are well suited to online learning applications where sample efficiency is of paramount importance. The impressive empirical performance of these architectures has long been known within the data compression community, but a theoretically satisfying explanation as to how and why they perform so well has proven difficult. What distinguishes these architectures from other neural systems is the distributed and local nature of their credit assignment mechanism; each neuron directly predicts the target and has its own set of hard-gated weights that are locally adapted via online convex optimization. By providing an interpretation, generalization and subsequent theoretical analysis, we show that sufficiently large GLNs are universal in a strong sense: not only can they model any compactly supported, continuous density function to arbitrary accuracy, but that any choice of no-regret online convex optimization technique will provably converge to the correct solution with enough data. Empirically we show a collection of single-pass learning results on established machine learning benchmarks that are competitive with results obtained with general purpose batch learning techniques.

* arXiv admin note: substantial text overlap with arXiv:1712.01897

Via

Access Paper or Ask Questions

Meta-learning of Sequential Strategies

May 08, 2019
Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

Figure 1 for Meta-learning of Sequential Strategies

Figure 2 for Meta-learning of Sequential Strategies

Figure 3 for Meta-learning of Sequential Strategies

Figure 4 for Meta-learning of Sequential Strategies

In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.

* DeepMind Technical Report (15 pages, 6 figures)

Via

Access Paper or Ask Questions

Online Learning with Gated Linear Networks

Dec 05, 2017
Joel Veness, Tor Lattimore, Avishkar Bhoopchand, Agnieszka Grabska-Barwinska, Christopher Mattern, Peter Toth

Figure 1 for Online Learning with Gated Linear Networks

Figure 2 for Online Learning with Gated Linear Networks

Figure 3 for Online Learning with Gated Linear Networks

Figure 4 for Online Learning with Gated Linear Networks

This paper describes a family of probabilistic architectures designed for online learning under the logarithmic loss. Rather than relying on non-linear transfer functions, our method gains representational power by the use of data conditioning. We state under general conditions a learnable capacity theorem that shows this approach can in principle learn any bounded Borel-measurable function on a compact subset of euclidean space; the result is stronger than many universality results for connectionist architectures because we provide both the model and the learning procedure for which convergence is guaranteed.

* 40 pages

Via

Access Paper or Ask Questions

Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

Dec 01, 2017
Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling

Figure 1 for Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

Figure 2 for Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

Figure 3 for Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

Figure 4 for Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community, leading to some high-profile success stories such as the much publicized Deep Q-Networks (DQN). In this article we take a big picture look at how the ALE is being used by the research community. We show how diverse the evaluation methodologies in the ALE have become with time, and highlight some key concerns when evaluating agents in the ALE. We use this discussion to present some methodological best practices and provide new benchmark results using these best practices. To further the progress in the field, we introduce a new version of the ALE that supports multiple game modes and provides a form of stochasticity we call sticky actions. We conclude this big picture look by revisiting challenges posed when the ALE was introduced, summarizing the state-of-the-art in various problems and highlighting problems that remain open.

Via

Access Paper or Ask Questions

Overcoming catastrophic forgetting in neural networks

Jan 25, 2017
James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell

Figure 1 for Overcoming catastrophic forgetting in neural networks

Figure 2 for Overcoming catastrophic forgetting in neural networks

Figure 3 for Overcoming catastrophic forgetting in neural networks

Figure 4 for Overcoming catastrophic forgetting in neural networks

The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks which they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on the MNIST hand written digit dataset and by learning several Atari 2600 games sequentially.

Via

Access Paper or Ask Questions

Compress and Control

Nov 19, 2014
Joel Veness, Marc G. Bellemare, Marcus Hutter, Alvin Chua, Guillaume Desjardins

This paper describes a new information-theoretic policy evaluation technique for reinforcement learning. This technique converts any compression or density model into a corresponding estimate of value. Under appropriate stationarity and ergodicity conditions, we show that the use of a sufficiently powerful model gives rise to a consistent value function estimator. We also study the behavior of this technique when applied to various Atari 2600 video games, where the use of suboptimal modeling techniques is unavoidable. We consider three fundamentally different models, all too limited to perfectly model the dynamics of the system. Remarkably, we find that our technique provides sufficiently accurate value estimates for effective on-policy control. We conclude with a suggestive study highlighting the potential of our technique to scale to large problems.

* 8 pages, 5 figures

Via

Access Paper or Ask Questions

Online Learning of k-CNF Boolean Functions

Mar 26, 2014
Joel Veness, Marcus Hutter

Figure 1 for Online Learning of k-CNF Boolean Functions

Figure 2 for Online Learning of k-CNF Boolean Functions

Figure 3 for Online Learning of k-CNF Boolean Functions

Figure 4 for Online Learning of k-CNF Boolean Functions

This paper revisits the problem of learning a k-CNF Boolean function from examples in the context of online learning under the logarithmic loss. In doing so, we give a Bayesian interpretation to one of Valiant's celebrated PAC learning algorithms, which we then build upon to derive two efficient, online, probabilistic, supervised learning algorithms for predicting the output of an unknown k-CNF Boolean function. We analyze the loss of our methods, and show that the cumulative log-loss can be upper bounded, ignoring logarithmic factors, by a polynomial function of the size of each example.

* 20 LaTeX pages. 2 Algorithms. Some Theorems

Via

Access Paper or Ask Questions

The Arcade Learning Environment: An Evaluation Platform for General Agents

Jun 21, 2013
Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling

Figure 1 for The Arcade Learning Environment: An Evaluation Platform for General Agents

Figure 2 for The Arcade Learning Environment: An Evaluation Platform for General Agents

Figure 3 for The Arcade Learning Environment: An Evaluation Platform for General Agents

Figure 4 for The Arcade Learning Environment: An Evaluation Platform for General Agents

In this article we introduce the Arcade Learning Environment (ALE): both a challenge problem and a platform and methodology for evaluating the development of general, domain-independent AI technology. ALE provides an interface to hundreds of Atari 2600 game environments, each one different, interesting, and designed to be a challenge for human players. ALE presents significant research challenges for reinforcement learning, model learning, model-based planning, imitation learning, transfer learning, and intrinsic motivation. Most importantly, it provides a rigorous testbed for evaluating and comparing approaches to these problems. We illustrate the promise of ALE by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning. In doing so, we also propose an evaluation methodology made possible by ALE, reporting empirical results on over 55 different games. All of the software, including the benchmark agents, is publicly available.

* Journal of Artificial Intelligence Research 47, pages 253-279

Via

Access Paper or Ask Questions

Partition Tree Weighting

Nov 21, 2012
Joel Veness, Martha White, Michael Bowling, András György

This paper introduces the Partition Tree Weighting technique, an efficient meta-algorithm for piecewise stationary sources. The technique works by performing Bayesian model averaging over a large class of possible partitions of the data into locally stationary segments. It uses a prior, closely related to the Context Tree Weighting technique of Willems, that is well suited to data compression applications. Our technique can be applied to any coding distribution at an additional time and space cost only logarithmic in the sequence length. We provide a competitive analysis of the redundancy of our method, and explore its application in a variety of settings. The order of the redundancy and the complexity of our algorithm matches those of the best competitors available in the literature, and the new algorithm exhibits a superior complexity-performance trade-off in our experiments.

Via

Access Paper or Ask Questions

An Approximation of the Universal Intelligence Measure

Sep 29, 2011
Shane Legg, Joel Veness

Figure 1 for An Approximation of the Universal Intelligence Measure

Figure 2 for An Approximation of the Universal Intelligence Measure

Figure 3 for An Approximation of the Universal Intelligence Measure

Figure 4 for An Approximation of the Universal Intelligence Measure

The Universal Intelligence Measure is a recently proposed formal definition of intelligence. It is mathematically specified, extremely general, and captures the essence of many informal definitions of intelligence. It is based on Hutter's Universal Artificial Intelligence theory, an extension of Ray Solomonoff's pioneering work on universal induction. Since the Universal Intelligence Measure is only asymptotically computable, building a practical intelligence test from it is not straightforward. This paper studies the practical issues involved in developing a real-world UIM-based performance metric. Based on our investigation, we develop a prototype implementation which we use to evaluate a number of different artificial agents.

* 14 pages

Via

Access Paper or Ask Questions