Neil Lawrence

Bayesian Learning via Neural Schrödinger-Föllmer Flows

Nov 29, 2021
Francisco Vargas, Andrius Ovsianas, David Fernandes, Mark Girolami, Neil Lawrence, Nikolas Nüsken

In this work we explore a new framework for approximate Bayesian inference on large datasets based on stochastic control. We advocate stochastic control as a finite-time alternative to popular steady-state methods such as stochastic gradient Langevin dynamics (SGLD). Furthermore, we discuss and adapt the existing theoretical guarantees of this framework and establish connections to existing variational inference (VI) routines for SDE-based models.
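
Below is a minimal, illustrative sketch (not the paper's code) of the kind of finite-time sampler this framework builds on: Euler-Maruyama simulation of a controlled SDE dX_t = u(t, X_t) dt + sigma dW_t on [0, 1], where in the Schrödinger-Föllmer setting the drift u would be a neural network trained via a stochastic control objective so that X_1 is approximately posterior-distributed. The toy drift, step count, and diffusion coefficient are placeholder assumptions.

```python
# Minimal sketch: Euler-Maruyama simulation of a controlled SDE on t in [0, 1].
# In the Schrödinger-Föllmer framework the drift would be a trained neural
# network; here a toy drift stands in for it.
import numpy as np

def simulate_controlled_sde(drift, x0, sigma=1.0, n_steps=100, rng=None):
    """Simulate one path of dX_t = drift(t, X_t) dt + sigma dW_t over [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    dt = 1.0 / n_steps
    x = np.array(x0, dtype=float)
    for k in range(n_steps):
        t = k * dt
        noise = rng.normal(size=x.shape) * np.sqrt(dt)
        x = x + drift(t, x) * dt + sigma * noise
    return x  # at t = 1 this would approximate a posterior sample

# Placeholder drift (steers samples toward the origin); a learned drift goes here.
toy_drift = lambda t, x: -x

sample = simulate_controlled_sde(toy_drift, x0=np.zeros(2))
print(sample)
```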

Meta-Surrogate Benchmarking for Hyperparameter Optimization

May 30, 2019
Aaron Klein, Zhenwen Dai, Frank Hutter, Neil Lawrence, Javier Gonzalez

Despite recent progress in hyperparameter optimization (HPO), available benchmarks that resemble real-world scenarios consist of only a few very large problem instances that are expensive to solve. This prevents researchers and practitioners not only from systematically running the large-scale comparisons needed to draw statistically significant conclusions, but also from reproducing previously conducted experiments. This work proposes a method to alleviate these issues by means of a meta-surrogate model for HPO tasks trained on offline-generated data. The model combines a probabilistic encoder with a multi-task model so that it can generate inexpensive and realistic tasks from the class of problems of interest. We demonstrate that benchmarking HPO methods on samples from the generative model allows us to draw more coherent and statistically significant conclusions, which can be reached orders of magnitude faster than using the original tasks. We provide evidence of our findings for various HPO methods on a wide class of problems.
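
The following sketch illustrates the benchmarking idea under simplifying assumptions: a latent task vector sampled from a prior parameterises a cheap synthetic objective, so an HPO method can be evaluated on many generated tasks instead of a handful of expensive real ones. The quadratic objective and random-search baseline are placeholders for the paper's learned probabilistic encoder, multi-task surrogate, and the HPO methods actually benchmarked.

```python
# Sketch: benchmark an HPO method on many cheaply generated synthetic tasks.
import numpy as np

rng = np.random.default_rng(0)

def sample_task(dim=2):
    """Sample a latent task vector that defines one synthetic HPO problem."""
    return rng.normal(size=dim)

def synthetic_objective(hyperparams, task):
    """Cheap stand-in objective; the real generator is a learned surrogate."""
    return float(np.sum((hyperparams - task) ** 2))

def random_search(objective, dim=2, budget=50):
    """Toy HPO method: evaluate random candidates, return the best score."""
    candidates = rng.uniform(-3, 3, size=(budget, dim))
    return min(objective(c) for c in candidates)

# Compare the method's performance across 20 generated tasks.
results = [random_search(lambda h, t=sample_task(): synthetic_objective(h, t))
           for _ in range(20)]
print(np.mean(results), np.std(results))
```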

Facilitating Bayesian Continual Learning by Natural Gradients and Stein Gradients

Apr 24, 2019
Yu Chen, Tom Diethe, Neil Lawrence

Continual learning aims to enable machine learning models to learn a general solution space for past and future tasks in a sequential manner. Conventional models tend to forget the knowledge of previous tasks while learning a new task, a phenomenon known as catastrophic forgetting. When using Bayesian models in continual learning, knowledge from previous tasks can be retained in two ways: (1) posterior distributions over the parameters, which contain the knowledge gained from inference on previous tasks and then serve as the priors for the following task; and (2) coresets, which contain knowledge of the data distributions of previous tasks. Here, we show that Bayesian continual learning can be facilitated along both of these lines through the use of natural gradients and Stein gradients, respectively.
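
A minimal sketch of mechanism (1), using a conjugate Gaussian model rather than the natural-gradient and Stein-gradient machinery developed in the paper: the posterior obtained after each task is carried forward as the prior for the next. The task sequence and noise level are illustrative assumptions.

```python
# Sketch: sequential Bayesian updating, where each task's posterior becomes
# the prior for the next task (a closed-form stand-in for the paper's method).
import numpy as np

def gaussian_posterior(prior_mean, prior_var, data, noise_var=1.0):
    """Closed-form posterior for a Gaussian mean with known noise variance."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + np.sum(data) / noise_var)
    return post_mean, post_var

rng = np.random.default_rng(0)
mean, var = 0.0, 10.0                      # broad initial prior
for task_shift in [1.0, 1.5, 2.0]:         # a sequence of related tasks
    data = rng.normal(loc=task_shift, scale=1.0, size=20)
    mean, var = gaussian_posterior(mean, var, data)  # posterior -> next prior
print(mean, var)
```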

* Continual Learning Workshop of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018)

Continual Learning in Practice

Mar 18, 2019
Tom Diethe, Tom Borchert, Eno Thereska, Borja Balle, Neil Lawrence

Figure 1 for Continual Learning in Practice

This paper describes a reference architecture for self-maintaining systems that can learn continually as data arrives. In environments where data evolves, we need architectures that manage Machine Learning (ML) models in production, adapt to shifting data distributions, cope with outliers, retrain when necessary, and adapt to new tasks. This represents continual AutoML, or Automatically Adaptive Machine Learning. We describe the challenges involved and propose such a reference architecture.
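
As one hedged illustration of what "retrain when necessary" can look like inside such an architecture (this is not the paper's reference architecture), the sketch below wraps a model in a monitoring loop that retrains when a simple drift statistic on incoming data exceeds a threshold. The drift check, threshold, and "model" are all placeholder assumptions.

```python
# Sketch: a managed model that retrains itself when incoming data drifts.
import numpy as np

class ManagedModel:
    def __init__(self, drift_threshold=0.5):
        self.train_mean = None
        self.drift_threshold = drift_threshold

    def fit(self, batch):
        self.train_mean = float(np.mean(batch))  # stand-in for real training

    def drift_detected(self, batch):
        """Crude drift statistic: shift in the batch mean versus training data."""
        return abs(float(np.mean(batch)) - self.train_mean) > self.drift_threshold

    def maybe_retrain(self, batch):
        if self.drift_detected(batch):
            self.fit(batch)  # retrain on recent data when the distribution shifts
            return True
        return False

rng = np.random.default_rng(0)
model = ManagedModel()
model.fit(rng.normal(0.0, 1.0, size=100))
print(model.maybe_retrain(rng.normal(2.0, 1.0, size=100)))  # shifted data -> True
```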

* Presented at the NeurIPS 2018 workshop on Continual Learning https://sites.google.com/view/continual2018/home 

Deep Gaussian Processes for Multi-fidelity Modeling

Mar 18, 2019
Kurt Cutajar, Mark Pullin, Andreas Damianou, Neil Lawrence, Javier González

Multi-fidelity methods are prominently used when cheaply obtained, but possibly biased and noisy, observations must be effectively combined with limited or expensive true data in order to construct reliable models. This arises in fundamental machine learning procedures such as Bayesian optimization, as well as in practical science and engineering applications. In this paper we develop a novel multi-fidelity model which treats layers of a deep Gaussian process as fidelity levels and uses a variational inference scheme to propagate uncertainty across them. This allows nonlinear correlations between fidelities to be captured with a lower risk of overfitting than existing methods exploiting compositional structure, which are burdened by structural assumptions and constraints. We show that the proposed approach makes substantial improvements in quantifying and propagating uncertainty in multi-fidelity set-ups, which in turn improves effectiveness in decision-making pipelines.
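
A hedged sketch of the compositional multi-fidelity idea, using scikit-learn GPs rather than the paper's variational deep-GP scheme: the high-fidelity "layer" conditions on both the input and the low-fidelity layer's prediction, mirroring the layer-per-fidelity structure. The data, kernels, and two-fidelity setup are illustrative assumptions.

```python
# Sketch: two-fidelity GP composition, where a second GP takes the input plus
# the low-fidelity GP's prediction as features.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X_lo = rng.uniform(0, 1, size=(50, 1))
y_lo = np.sin(8 * X_lo[:, 0])                      # cheap, biased observations
X_hi = rng.uniform(0, 1, size=(8, 1))
y_hi = np.sin(8 * X_hi[:, 0]) + 0.2 * X_hi[:, 0]   # scarce, accurate observations

gp_lo = GaussianProcessRegressor(kernel=RBF()).fit(X_lo, y_lo)

# High-fidelity layer conditions on the low-fidelity layer's output.
feats_hi = np.column_stack([X_hi, gp_lo.predict(X_hi)])
gp_hi = GaussianProcessRegressor(kernel=RBF()).fit(feats_hi, y_hi)

X_test = np.linspace(0, 1, 5).reshape(-1, 1)
feats_test = np.column_stack([X_test, gp_lo.predict(X_test)])
mean, std = gp_hi.predict(feats_test, return_std=True)
print(mean, std)
```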

Intrinsic Gaussian processes on complex constrained domains

Jan 03, 2018
Mu Niu, Pokman Cheung, Lizhen Lin, Zhenwen Dai, Neil Lawrence, David Dunson

We propose a class of intrinsic Gaussian processes (in-GPs) for interpolation, regression and classification on manifolds, with a primary focus on complex constrained domains or irregularly shaped spaces arising as subsets or submanifolds of R, R^2, R^3 and beyond. For example, in-GPs can accommodate spatial domains arising as complex subsets of Euclidean space. in-GPs respect the potentially complex boundary or interior conditions as well as the intrinsic geometry of the spaces. The key novelty of the proposed approach is to utilise the relationship between heat kernels and the transition density of Brownian motion on manifolds to construct and approximate valid and computationally feasible covariance kernels. This enables in-GPs to be practically applied in great generality, while existing approaches for smoothing on constrained domains are limited to simple special cases. The broad utility of the in-GP approach is illustrated through simulation studies and data examples.
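
A rough numerical sketch of the key idea, under strong simplifying assumptions that are not the paper's construction: approximate a heat-kernel covariance on a constrained domain by simulating Brownian motion confined to the domain and estimating its transition density. The unit-disc domain, rejection-based confinement, and kernel-density bandwidth are illustrative choices only.

```python
# Sketch: Monte Carlo estimate of a heat-kernel covariance on a constrained
# domain via the transition density of confined Brownian motion.
import numpy as np

rng = np.random.default_rng(0)

def in_domain(p):
    return np.linalg.norm(p) <= 1.0        # unit disc as the constrained domain

def brownian_endpoints(x, t=0.25, n_paths=2000, n_steps=50):
    """Simulate confined Brownian motion started at x; return positions at time t."""
    dt = t / n_steps
    pos = np.tile(np.asarray(x, float), (n_paths, 1))
    for _ in range(n_steps):
        proposal = pos + rng.normal(scale=np.sqrt(dt), size=pos.shape)
        ok = np.array([in_domain(p) for p in proposal])
        pos[ok] = proposal[ok]              # reject steps that leave the domain
    return pos

def heat_kernel_estimate(x, y, t=0.25, bandwidth=0.1):
    """Kernel-density estimate of the transition density p_t(x, y)."""
    ends = brownian_endpoints(x, t)
    sq_dist = np.sum((ends - np.asarray(y, float)) ** 2, axis=1)
    return float(np.mean(np.exp(-sq_dist / (2 * bandwidth ** 2))
                         / (2 * np.pi * bandwidth ** 2)))

print(heat_kernel_estimate([0.2, 0.0], [0.3, 0.1]))
```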

Parallelizable sparse inverse formulation Gaussian processes (SpInGP)

Sep 28, 2017
Alexander Grigorievskiy, Neil Lawrence, Simo Särkkä

We propose a parallelizable sparse inverse formulation Gaussian process (SpInGP) for temporal models. It uses a sparse precision GP formulation and sparse matrix routines to speed up the computations. Due to the state-space formulation used in the algorithm, the time complexity of the basic SpInGP is linear, and because all the computations are parallelizable, the parallel form of the algorithm is sublinear in the number of data points. We provide example algorithms to implement the sparse matrix routines and experimentally test the method using both simulated and real data.
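
A small sketch in the spirit of the sparse-precision formulation (not the paper's implementation): for a regularly sampled OU / Matérn-1/2 process, the GP precision matrix is tridiagonal, so posterior inference reduces to a sparse linear solve that scales linearly in the number of data points. The kernel, grid, and noise level are toy values.

```python
# Sketch: GP regression via a sparse (tridiagonal) precision matrix for an
# OU process on a regular time grid.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def ou_precision(n, dt, lengthscale=1.0, variance=1.0):
    """Tridiagonal precision matrix of an OU process on a regular grid of n points."""
    phi = np.exp(-dt / lengthscale)
    q = variance * (1.0 - phi ** 2)         # innovation variance
    main = np.full(n, (1.0 + phi ** 2) / q)
    main[0] = main[-1] = 1.0 / q
    off = np.full(n - 1, -phi / q)
    return sp.diags([off, main, off], offsets=[-1, 0, 1], format="csc")

n, dt, noise_var = 500, 0.01, 0.1
t = np.arange(n) * dt
y = np.sin(2 * np.pi * t) + np.sqrt(noise_var) * np.random.default_rng(0).normal(size=n)

Q = ou_precision(n, dt)
posterior_precision = Q + sp.eye(n, format="csc") / noise_var
posterior_mean = spsolve(posterior_precision, y / noise_var)  # sparse, linear-time solve
print(posterior_mean[:5])
```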

* Presented at Machine Learning for Signal Processing (MLSP 2017)