Samuel Kessler

On Sequential Bayesian Inference for Continual Learning

Jan 04, 2023
Samuel Kessler, Adam Cobb, Tim G. J. Rudner, Stefan Zohren, Stephen J. Roberts

Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and to provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and test whether having access to the true posterior is guaranteed to prevent catastrophic forgetting in Bayesian neural networks. To do this, we perform sequential Bayesian inference using Hamiltonian Monte Carlo and propagate the posterior as a prior for new tasks by fitting a density estimator on Hamiltonian Monte Carlo samples. We find that this approach fails to prevent catastrophic forgetting, demonstrating the difficulty of performing sequential Bayesian inference in neural networks. From there we study simple analytical examples of sequential Bayesian inference and continual learning and highlight the issue of model misspecification, which can lead to sub-optimal continual learning performance despite exact inference. Furthermore, we discuss how task data imbalances can cause forgetting. From these limitations, we argue that we need probabilistic models of the continual learning generative process rather than relying on sequential Bayesian inference over Bayesian neural network weights. In this vein, we also propose a simple baseline called Prototypical Bayesian Continual Learning, which is competitive with state-of-the-art Bayesian continual learning methods on class-incremental continual learning vision benchmarks.
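
A minimal sketch of the prior-propagation step described above, under strong simplifying assumptions: a scalar Gaussian model and a random-walk Metropolis sampler stand in for the paper's Bayesian neural networks and Hamiltonian Monte Carlo, and a Gaussian mixture plays the role of the density estimator fitted to the task-1 posterior samples and reused as the prior for task 2. None of the names below come from the paper's code.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def metropolis(log_post, n_samples=3000, step=0.3, init=0.0):
    """Random-walk Metropolis over a scalar parameter (stand-in for HMC)."""
    theta, lp, samples = init, log_post(init), []
    for _ in range(n_samples):
        prop = theta + step * rng.normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
            theta, lp = prop, lp_prop
        samples.append(theta)
    return np.array(samples)

def gaussian_loglik(theta, data, sigma=1.0):
    return -0.5 * np.sum((data - theta) ** 2) / sigma**2

# Task 1: broad Gaussian prior N(0, 10^2).
task1_data = rng.normal(loc=2.0, scale=1.0, size=50)
log_post_1 = lambda th: gaussian_loglik(th, task1_data) - 0.5 * th**2 / 10**2
samples_1 = metropolis(log_post_1)

# Fit a density estimator to the task-1 posterior samples ...
density = GaussianMixture(n_components=3).fit(samples_1.reshape(-1, 1))

# ... and reuse its log-density as the prior when inferring task 2.
task2_data = rng.normal(loc=2.5, scale=1.0, size=50)
log_post_2 = lambda th: gaussian_loglik(th, task2_data) + \
    density.score_samples(np.array([[th]]))[0]
samples_2 = metropolis(log_post_2, init=float(samples_1.mean()))
print("task-2 posterior mean:", samples_2.mean())
```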

* 21 pages, 14 figures 

The Surprising Effectiveness of Latent World Models for Continual Reinforcement Learning

Nov 29, 2022
Samuel Kessler, Piotr Miłoś, Jack Parker-Holder, Stephen J. Roberts

We study the use of model-based reinforcement learning methods, in particular world models, for continual reinforcement learning. In continual reinforcement learning, an agent is required to solve a sequence of tasks while retaining performance on, and preventing forgetting of, past tasks. World models offer a task-agnostic solution: they do not require knowledge of task changes. World models are a straightforward baseline for continual reinforcement learning for three main reasons. Firstly, forgetting in the world model is prevented by persisting the experience replay buffer across tasks: experience from previous tasks is replayed when learning the world model. Secondly, they are sample efficient. Thirdly and finally, they offer a task-agnostic exploration strategy through the uncertainty in the trajectories generated by the world model. We show that world models are a simple and effective continual reinforcement learning baseline. We study their effectiveness on the Minigrid and Minihack continual reinforcement learning benchmarks and show that they outperform state-of-the-art task-agnostic continual reinforcement learning methods.
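
As a rough illustration of the first point, the sketch below keeps a single replay buffer alive across task boundaries so that world-model updates always mix old- and new-task transitions. The gym-style environment interface and the `model_update_fn` hook are assumptions made for illustration, not the paper's implementation.

```python
import random
from collections import deque

class PersistentReplayBuffer:
    """One buffer for the whole task sequence; it is never cleared."""
    def __init__(self, capacity=100_000):
        self.storage = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done):
        self.storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        return random.sample(list(self.storage), min(batch_size, len(self.storage)))

def train_world_model(model_update_fn, envs_per_task, buffer, steps_per_task=1000):
    """Train on a sequence of tasks WITHOUT clearing the buffer in between."""
    for env in envs_per_task:                         # task boundaries never signalled
        obs = env.reset()
        for _ in range(steps_per_task):
            action = env.action_space.sample()        # exploration placeholder
            next_obs, reward, done, _ = env.step(action)
            buffer.add(obs, action, reward, next_obs, done)
            model_update_fn(buffer.sample(64))        # batch mixes old- and new-task data
            obs = env.reset() if done else next_obs
```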

* 13 pages, 6 figures, accepted at the Deep RL workshop NeurIPS 22 

Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition

Feb 07, 2022
Bethan Thomas, Samuel Kessler, Salah Karout

Self-supervised learning (SSL) is a powerful tool that allows learning of underlying representations from unlabeled data. Transformer-based models such as wav2vec 2.0 and HuBERT are leading the field in the speech domain. Generally, these models are fine-tuned on a small amount of labeled data for a downstream task such as Automatic Speech Recognition (ASR), which involves re-training the majority of the model for each task. Adapters are small, lightweight modules commonly used in Natural Language Processing (NLP) to adapt pre-trained models to new tasks. In this paper we propose applying adapters to wav2vec 2.0 to reduce the number of parameters required for downstream ASR tasks and to increase the scalability of the model to multiple tasks or languages. Using adapters, we can perform ASR while training fewer than 10% of the parameters per task compared to full fine-tuning, with little degradation in performance. Ablations show that applying adapters to just the top few layers of the pre-trained network gives performance similar to full transfer, supporting the theory that higher pre-trained layers encode more phonemic information, and further improving efficiency.
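
A hedged PyTorch sketch of the adapter idea described above: a bottleneck module with a residual connection, attached only to the top few encoder layers while the pre-trained backbone stays frozen. Dimensions, layer counts, and where the adapters are wired in are illustrative assumptions rather than the exact wav2vec 2.0 integration.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck module with a residual connection."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x):
        # Residual keeps the pre-trained function roughly intact at initialisation.
        return x + self.up(self.act(self.down(x)))

def add_adapters(encoder_layers, dim=768, top_k=4):
    """Freeze the backbone and attach adapters to the top-k layers only."""
    for p in encoder_layers.parameters():
        p.requires_grad = False                     # backbone stays frozen
    adapters = nn.ModuleList(
        Adapter(dim) if i >= len(encoder_layers) - top_k else nn.Identity()
        for i in range(len(encoder_layers))
    )
    return adapters  # only these (plus a task head) receive gradients
```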

* 5 Pages, 4 figures. Accepted to ICASSP 2022 

Continual-wav2vec2: an Application of Continual Learning for Self-Supervised Automatic Speech Recognition

Jul 26, 2021
Samuel Kessler, Bethan Thomas, Salah Karout

We present a method for continual learning of speech representations for multiple languages using self-supervised learning (SSL) and applying these representations to automatic speech recognition (ASR). There is an abundance of unannotated speech, so creating self-supervised representations from raw audio and fine-tuning on a small annotated dataset is a promising direction for building speech recognition systems. Wav2vec models perform SSL on raw audio in a pretraining phase and are then fine-tuned on a small fraction of annotated data. SSL models have produced state-of-the-art results for ASR. However, these models are very expensive to pretrain with self-supervision. We tackle the problem of learning new language representations continually from audio without forgetting previous language representations. We use ideas from continual learning to transfer knowledge from a previous task to speed up pretraining a new language task. Our continual-wav2vec2 model can decrease pretraining times by 32% when learning a new language task, and learns this new audio-language representation without forgetting previously learned language representations.
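
The abstract does not spell out the transfer mechanism, but the continual-pretraining loop it implies can be sketched as follows: the model is warm-started from the previous language rather than re-initialised, so later languages should need fewer pretraining steps. The `pretrain_ssl_step` routine and the stopping criterion below are hypothetical placeholders, not the paper's method.

```python
def continual_pretrain(model, language_corpora, pretrain_ssl_step, target_loss):
    """Pretrain on each language in turn WITHOUT re-initialising the model."""
    steps_per_language = {}
    for lang, corpus in language_corpora.items():
        steps = 0
        # pretrain_ssl_step performs one self-supervised update and returns the loss.
        while pretrain_ssl_step(model, corpus) > target_loss:
            steps += 1
        # Warm-started (later) languages should reach target_loss in fewer steps.
        steps_per_language[lang] = steps
    return steps_per_language
```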

* 11 pages, 9 figures including references and appendix. Accepted at ICML 2021 Workshop: Self-Supervised Learning for Reasoning and Perception 

Same State, Different Task: Continual Reinforcement Learning without Interference

Jun 05, 2021
Samuel Kessler, Jack Parker-Holder, Philip Ball, Stefan Zohren, Stephen J. Roberts

Continual Learning (CL) considers the problem of training an agent sequentially on a set of tasks while seeking to retain performance on all previous tasks. A key challenge in CL is catastrophic forgetting, which arises when learning a new task reduces performance on a previously mastered task. While a variety of methods exist to combat forgetting, in some cases tasks are fundamentally incompatible with each other and thus cannot be learnt by a single policy. This can occur in reinforcement learning (RL) when an agent is rewarded for achieving different goals from the same observation. In this paper we formalize this "interference" as distinct from the problem of forgetting. We show that existing CL methods based on single neural network predictors with shared replay buffers fail in the presence of interference. Instead, we propose a simple method, OWL, to address this challenge. OWL learns a factorized policy, using shared feature extraction layers but separate heads, each specializing in a different task. The separate heads in OWL are used to prevent interference. At test time, we formulate policy selection as a multi-armed bandit problem and show that it is possible to select the best policy for an unknown task using feedback from the environment. The use of bandit algorithms allows the OWL agent to constructively re-use different continually learnt policies at different times during an episode. We show in multiple RL environments that existing replay-based CL methods fail, while OWL is able to achieve close to optimal performance when training sequentially.
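
A hedged sketch of the two ingredients described above: a policy with a shared trunk and per-task heads, and a bandit that picks which head to act with at test time from environment reward. The trunk/head sizes and the choice of a UCB1-style bandit are illustrative assumptions, not necessarily the exact OWL setup.

```python
import math
import torch
import torch.nn as nn

class FactorizedPolicy(nn.Module):
    """Shared feature-extraction trunk with one output head per task."""
    def __init__(self, obs_dim, act_dim, n_tasks, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, act_dim) for _ in range(n_tasks))

    def forward(self, obs, head_idx):
        return self.heads[head_idx](self.trunk(obs))   # logits for the chosen head

class UCBHeadSelector:
    """Pick the acting head from observed returns (UCB1-style bandit)."""
    def __init__(self, n_heads, c=1.0):
        self.counts = [0] * n_heads
        self.values = [0.0] * n_heads
        self.c, self.t = c, 0

    def select(self):
        self.t += 1
        for i, n in enumerate(self.counts):
            if n == 0:
                return i                               # try every head at least once
        ucb = [v + self.c * math.sqrt(math.log(self.t) / n)
               for v, n in zip(self.values, self.counts)]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, head_idx, reward):
        self.counts[head_idx] += 1
        n = self.counts[head_idx]
        self.values[head_idx] += (reward - self.values[head_idx]) / n  # running mean
```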

* 20 pages, 12 figures 

Indian Buffet Neural Networks for Continual Learning

Dec 04, 2019
Samuel Kessler, Vu Nguyen, Stefan Zohren, Stephen Roberts

We place an Indian Buffet Process (IBP) prior over the neural structure of a Bayesian Neural Network (BNN), thus allowing the complexity of the BNN to increase and decrease automatically. We apply this methodology to the problem of resource allocation in continual learning, where new tasks occur sequentially and the network requires extra resources. Our BNN exploits online variational inference with relaxations of the Bernoulli and Beta distributions (which constitute the IBP prior), thus allowing the use of the reparameterisation trick to learn variational posteriors via gradient-based methods. As we automatically learn the number of weights in the BNN, overfitting and underfitting problems are largely overcome. We show empirically that the method offers competitive results compared to Variational Continual Learning (VCL) in some settings.
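
A hedged PyTorch sketch of the structural prior described above: stick-breaking IBP activation probabilities built from reparameterisable Kumaraswamy samples (a common relaxation of the Beta), gating hidden units through a relaxed (Concrete) Bernoulli so the masks stay differentiable. Shapes, temperatures, and parameter names are illustrative assumptions, not the paper's implementation.

```python
import torch
from torch.distributions import RelaxedBernoulli

def kumaraswamy_rsample(a, b):
    """Reparameterised Kumaraswamy(a, b) sample, a relaxation of Beta(a, b)."""
    u = torch.rand_like(a).clamp(1e-6, 1 - 1e-6)
    return (1.0 - (1.0 - u).pow(1.0 / b)).pow(1.0 / a)

def ibp_unit_mask(log_a, log_b, temperature=0.5):
    """Soft binary mask over hidden units from a stick-breaking IBP prior."""
    v = kumaraswamy_rsample(log_a.exp(), log_b.exp())        # stick fractions
    pi = torch.cumprod(v, dim=-1)                            # P(unit k is active)
    gates = RelaxedBernoulli(torch.tensor(temperature), probs=pi).rsample()
    return gates                                             # differentiable mask

# Usage: multiply a hidden layer's activations by the mask, and optimise
# log_a, log_b jointly with the weights under the variational objective.
log_a = torch.zeros(64, requires_grad=True)
log_b = torch.zeros(64, requires_grad=True)
mask = ibp_unit_mask(log_a, log_b)
```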

* Camera-ready submission 