Abstract:Bayesian optimization has demonstrated impressive success in finding the optimum location $x^{*}$ and value $f^{*}=f(x^{*})=\max_{x\in\mathcal{X}}f(x)$ of a black-box function $f$. In some applications, however, the optimum value is known in advance, and the goal is to find the corresponding optimum location. Existing work in Bayesian optimization (BO) has not effectively exploited the knowledge of $f^{*}$ for optimization. In this paper, we consider a new setting in BO in which knowledge of the optimum value is available. Our goal is to exploit the knowledge of $f^{*}$ to search for the location $x^{*}$ efficiently. To achieve this goal, we first transform the Gaussian process surrogate using the information about the optimum value. We then propose two acquisition functions, called confidence bound minimization and expected regret minimization, which exploit the knowledge of the optimum value to identify the optimum location efficiently. We show that our approaches behave intuitively and quantitatively outperform standard BO methods. We demonstrate real applications in tuning a deep reinforcement learning algorithm on the CartPole problem and XGBoost on the Skin Segmentation dataset, for which the optimum values are publicly available.
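As a minimal sketch of how a known optimum value $f^{*}$ can be plugged into an acquisition function: under a Gaussian posterior, the expected regret $\mathbb{E}[(f^{*}-f(x))_{+}]$ has the standard closed form used below, while the "confidence bound" variant shown is only one plausible form; the exact expressions used in the paper may differ.

```python
# Hedged sketch: acquisition functions that exploit a known optimum value f_star.
# Assumes a fitted GP posterior providing mu(x) and sigma(x). The expected-regret
# expression is the standard closed form of E[(f_star - f(x))_+] under a Gaussian
# posterior; the confidence-bound form is an assumption for illustration.
import numpy as np
from scipy.stats import norm

def expected_regret(mu, sigma, f_star):
    """E[(f_star - f(x))_+] for f(x) ~ N(mu, sigma^2); minimise this over x."""
    sigma = np.maximum(sigma, 1e-12)
    z = (f_star - mu) / sigma
    return (f_star - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def confidence_bound_gap(mu, sigma, f_star, beta=2.0):
    """One plausible 'confidence bound minimisation': distance between the upper
    confidence bound and the known optimum value; minimise over x."""
    return np.abs(f_star - (mu + beta * sigma))

# Example: score a grid of candidates and pick the next query point.
mu = np.array([0.2, 0.8, 0.95])      # GP posterior means (hypothetical)
sigma = np.array([0.3, 0.1, 0.05])   # GP posterior std devs (hypothetical)
f_star = 1.0                         # known optimum value
next_idx = int(np.argmin(expected_regret(mu, sigma, f_star)))
```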
Abstract:We present a novel technique for tailoring Bayesian quadrature (BQ) to model selection. The state-of-the-art for comparing the evidence of multiple models relies on Monte Carlo methods, which converge slowly and are unreliable for computationally expensive models. Previous research has shown that BQ offers sample efficiency superior to Monte Carlo in computing the evidence of an individual model. However, applying BQ directly to model comparison may waste computation producing an overly-accurate estimate for the evidence of a clearly poor model. We propose an automated and efficient algorithm for computing the most-relevant quantity for model selection: the posterior probability of a model. Our technique maximizes the mutual information between this quantity and observations of the models' likelihoods, yielding efficient acquisition of samples across disparate model spaces when likelihood observations are limited. Our method produces more-accurate model posterior estimates using fewer model likelihood evaluations than standard Bayesian quadrature and Monte Carlo estimators, as we demonstrate on synthetic and real-world examples.
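To make the target quantity concrete, the sketch below shows vanilla Bayesian quadrature estimates of each model's evidence on a one-dimensional parameter grid and the resulting model posterior. All data, kernels, and lengthscales are hypothetical, and the paper's mutual-information acquisition strategy is not implemented here.

```python
# Hedged sketch: vanilla BQ estimates of model evidence on a 1-D parameter grid,
# combined into a model posterior. This is the quantity the paper targets; it does
# NOT implement the mutual-information acquisition described in the abstract.
import numpy as np

def rbf(a, b, ls=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def bq_evidence(theta_obs, lik_obs, prior_grid, prior_pdf, ls=0.5, jitter=1e-8):
    """BQ posterior mean of Z = int L(theta) p(theta) dtheta with a GP on L."""
    K = rbf(theta_obs, theta_obs, ls) + jitter * np.eye(len(theta_obs))
    # Kernel mean z_i = int k(theta, theta_i) p(theta) dtheta, via a Riemann sum.
    dtheta = prior_grid[1] - prior_grid[0]
    z = (rbf(prior_grid, theta_obs, ls) * prior_pdf[:, None]).sum(0) * dtheta
    return z @ np.linalg.solve(K, lik_obs)

grid = np.linspace(-3, 3, 400)
prior = np.exp(-0.5 * grid ** 2) / np.sqrt(2 * np.pi)   # standard normal prior
theta_obs = np.array([-1.0, 0.0, 1.0])                  # likelihood evaluation points
models = {"m1": np.array([0.10, 0.90, 0.20]),           # hypothetical L(theta) values
          "m2": np.array([0.30, 0.35, 0.30])}
Z = {m: max(bq_evidence(theta_obs, y, grid, prior), 1e-12) for m, y in models.items()}
posterior = {m: z / sum(Z.values()) for m, z in Z.items()}  # equal model priors assumed
```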
Abstract:Batch Bayesian optimisation (BO) has been successfully applied to hyperparameter tuning using parallel computing, but it is wasteful of resources: workers that complete jobs ahead of others are left idle. We address this problem by developing an approach, Penalising Locally for Asynchronous Bayesian Optimisation on $k$ workers (PLAyBOOK), for asynchronous parallel BO. We demonstrate empirically the efficacy of PLAyBOOK and its variants on synthetic tasks and a real-world problem. We undertake a comparison between synchronous and asynchronous BO, and show that asynchronous BO often outperforms synchronous batch BO in both wall-clock time and number of function evaluations.
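The sketch below illustrates the generic local-penalisation idea behind asynchronous BO of this kind: down-weight the acquisition function near points that busy workers are still evaluating. The penaliser shown is one common form from the local-penalisation literature; PLAyBOOK's specific penalisers may differ.

```python
# Hedged sketch of local penalisation for asynchronous BO: multiply the acquisition
# by a penaliser for each point currently under evaluation by a busy worker.
import numpy as np
from scipy.stats import norm

def penaliser(x, x_busy, mu_busy, best_y, lipschitz, sigma_busy):
    """One common soft penaliser: probability that x lies outside an estimated
    exclusion ball around the busy point x_busy."""
    r = np.abs(best_y - mu_busy) / lipschitz          # estimated exclusion radius
    d = np.linalg.norm(x - x_busy)
    return norm.cdf((d - r) * lipschitz / (np.sqrt(2.0) * sigma_busy))

def penalised_acquisition(x, base_acq, busy_points, mu, sigma, best_y, L):
    """Multiply a non-negative base acquisition by one penaliser per busy point."""
    value = base_acq(x)
    for xb in busy_points:
        value *= penaliser(x, xb, mu(xb), best_y, L, sigma(xb))
    return value

# Toy usage: one busy worker at x = 0.5; base acquisition is a simple bump.
value = penalised_acquisition(
    np.array([0.45]),
    base_acq=lambda x: np.exp(-10 * (x[0] - 0.4) ** 2),
    busy_points=[np.array([0.5])],
    mu=lambda x: 0.2, sigma=lambda x: 0.1,
    best_y=0.0, L=5.0,
)
```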
Abstract:This article is the rejoinder for the paper "Probabilistic Integration: A Role in Statistical Computation?" to appear in Statistical Science with discussion. We would first like to thank the reviewers and many of our colleagues who helped shape this paper, the editor for selecting our paper for discussion, and of course all of the discussants for their thoughtful, insightful and constructive comments. In this rejoinder, we respond to some of the points raised by the discussants and comment further on the fundamental questions underlying the paper: (i) Should Bayesian ideas be used in numerical analysis?, and (ii) If so, what role should such approaches have in statistical computation?
Abstract:Policy gradient methods have been successfully applied to a variety of reinforcement learning tasks. However, while learning in a simulator, these methods do not utilise the opportunity to improve learning by adjusting certain environment variables: unobservable state features that are randomly determined by the environment in a physical setting, but that are controllable in a simulator. This can lead to slow learning or convergence to highly suboptimal policies if the environment variable has a large impact on the transition dynamics. In this paper, we present fingerprint policy optimisation (FPO), which finds a policy that is optimal in expectation across the distribution of environment variables. The central idea is to use Bayesian optimisation (BO) to actively select the distribution of the environment variable that maximises the improvement generated by each iteration of the policy gradient method. To make this BO practical, we contribute two easy-to-compute low-dimensional fingerprints of the current policy. We apply FPO to a number of continuous control tasks of varying difficulty and show that FPO can efficiently learn policies that are robust to significant rare events, which are unlikely to be observable under random sampling but are key to learning good policies.
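The sketch below outlines the outer loop suggested by the abstract: a GP maps (policy fingerprint, environment-distribution parameter) pairs to observed policy improvement, and BO selects the environment-distribution parameter for the next policy-gradient iteration. The fingerprint, the candidate parameters, and the observed improvements are all hypothetical stand-ins, not FPO's exact choices.

```python
# Hedged sketch of an FPO-style outer loop: BO over the environment-distribution
# parameter psi, conditioned on a low-dimensional fingerprint of the current policy.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def fingerprint(policy_params):
    """Hypothetical low-dimensional fingerprint (not the paper's exact construction)."""
    return np.array([policy_params.mean(), policy_params.std()])

def select_env_param(gp, fp, candidate_psis, beta=2.0):
    """Upper-confidence-bound choice of the environment-distribution parameter psi."""
    X = np.array([np.concatenate([fp, [psi]]) for psi in candidate_psis])
    mu, sd = gp.predict(X, return_std=True)
    return float(candidate_psis[int(np.argmax(mu + beta * sd))])

# Minimal synthetic demo with stand-in policy parameters and observed improvements.
rng = np.random.default_rng(0)
fp = fingerprint(rng.normal(size=16))
psis = np.linspace(0.0, 1.0, 5)
X_obs = np.array([np.concatenate([fp, [p]]) for p in psis])
improvements = rng.normal(size=5)                 # stand-in policy-improvement observations
gp = GaussianProcessRegressor().fit(X_obs, improvements)
next_psi = select_env_param(gp, fp, np.linspace(0.0, 1.0, 101))
```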
Abstract:Accurately predicting the future health of batteries is necessary to ensure reliable operation, minimise maintenance costs, and calculate the value of energy storage investments. The complex nature of degradation renders data-driven approaches a promising alternative to mechanistic modelling. This study predicts the changes in battery capacity over time using a Bayesian non-parametric approach based on Gaussian process regression. These changes can be integrated against an arbitrary input sequence to predict capacity fade in a variety of usage scenarios, forming a generalised health model. The approach naturally incorporates varying current, voltage and temperature inputs, crucial for enabling real-world application. A key innovation is the feature selection step, where arbitrary-length current, voltage and temperature measurement vectors are mapped to fixed-size feature vectors, enabling them to be efficiently used as exogenous variables. The approach is demonstrated on the open-source NASA Randomised Battery Usage Dataset, with data from 26 cells aged under randomised operational conditions. Using half of the cells for training and half for validation, the method is shown to accurately predict non-linear capacity fade, with a best-case normalised root mean square error of 4.3%, including accurate estimation of prediction uncertainty.
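As a minimal sketch of the modelling pipeline described above: variable-length current, voltage and temperature measurement vectors are mapped to fixed-size feature vectors (here, simple summary statistics, which is an assumption; the paper's feature construction may differ) and a Gaussian process regresses the capacity change on those features. The data below are synthetic stand-ins.

```python
# Hedged sketch: fixed-size features from variable-length measurements, then GP
# regression of capacity change with predictive uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def to_features(current, voltage, temperature):
    """Fixed-size features from variable-length measurement vectors."""
    stats = lambda v: [np.mean(v), np.std(v), np.min(v), np.max(v)]
    return np.array(stats(current) + stats(voltage) + stats(temperature))

# Synthetic stand-in data: each cycle has measurements of different lengths.
rng = np.random.default_rng(1)
cycles = [(rng.normal(size=n), rng.normal(size=n), rng.normal(size=n))
          for n in rng.integers(50, 200, size=30)]
X = np.stack([to_features(*c) for c in cycles])
delta_capacity = rng.normal(scale=0.01, size=len(cycles))   # stand-in targets

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, delta_capacity)
mean, std = gp.predict(X[:5], return_std=True)   # predictions with uncertainty
```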
Abstract:Information-theoretic Bayesian optimisation techniques have demonstrated state-of-the-art performance in tackling important global optimisation problems. However, current information-theoretic approaches require many approximations in implementation, introduce often-prohibitive computational overhead and limit the choice of kernels available to model the objective. We develop a fast information-theoretic Bayesian optimisation method, FITBO, that avoids the need for sampling the global minimiser, thus significantly reducing computational overhead. Moreover, in comparison with existing approaches, our method faces fewer constraints on kernel choice and benefits from operating in the output space. We demonstrate empirically that FITBO inherits the performance associated with information-theoretic Bayesian optimisation, while being even faster than simpler Bayesian optimisation approaches, such as Expected Improvement.
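To illustrate why reasoning in the output space is attractive, the sketch below shows an output-space information-theoretic acquisition in the spirit of max-value entropy search, which averages a closed-form information-gain term over sampled optimum values. This is not the FITBO acquisition itself, only a related illustration; the optimum-value samples are hypothetical.

```python
# Hedged sketch of an output-space information-theoretic acquisition (max-value
# entropy search style), NOT the FITBO formula: information gain about the optimum
# VALUE from evaluating a candidate, averaged over sampled optimum values.
import numpy as np
from scipy.stats import norm

def output_space_info_gain(mu, sigma, y_star_samples):
    sigma = np.maximum(sigma, 1e-12)
    gains = []
    for y_star in y_star_samples:
        g = (y_star - mu) / sigma
        cdf = np.maximum(norm.cdf(g), 1e-12)
        gains.append(g * norm.pdf(g) / (2.0 * cdf) - np.log(cdf))
    return np.mean(gains, axis=0)

mu = np.array([0.1, 0.5, 0.9])                 # GP posterior means (hypothetical)
sigma = np.array([0.4, 0.2, 0.05])             # GP posterior std devs (hypothetical)
acq = output_space_info_gain(mu, sigma, y_star_samples=[1.0, 1.1, 1.3])
```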
Abstract:We develop the first Bayesian Optimization algorithm, BLOSSOM, which selects between multiple alternative acquisition functions and traditional local optimization at each step. This is combined with a novel stopping condition based on expected regret. This pairing allows us to obtain the best characteristics of both local and Bayesian optimization, making efficient use of function evaluations while yielding superior convergence to the global minimum on a selection of optimization problems, and also halting optimization once a principled and intuitive stopping condition has been fulfilled.
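As a hedged illustration of an expected-regret style stopping condition: halt once no candidate offers expected improvement over the incumbent above a small tolerance. This simple rule is an assumption for illustration; the precise criterion used by BLOSSOM may differ.

```python
# Hedged sketch: expected-regret style stopping rule for minimisation.
import numpy as np
from scipy.stats import norm

def expected_improvement_min(mu, sigma, best_y):
    """EI over the incumbent under a Gaussian posterior at each candidate."""
    sigma = np.maximum(sigma, 1e-12)
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def should_stop(mu, sigma, best_y, tol=1e-3):
    """Stop when the largest remaining expected improvement falls below tol."""
    return float(np.max(expected_improvement_min(mu, sigma, best_y))) < tol

# Toy usage with hypothetical posterior values at three candidates.
stop = should_stop(np.array([0.20, 0.15, 0.40]), np.array([0.01, 0.02, 0.30]), best_y=0.14)
```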
Abstract:We propose a novel Bayesian Optimization approach for black-box functions with an environmental variable whose value determines the tradeoff between evaluation cost and the fidelity of the evaluations. Further, we use a novel approach to sampling support points, allowing faster construction of the acquisition function. This allows us to achieve optimization with lower overheads than previous approaches, and our method applies to a more general class of problems. We show this approach to be effective on synthetic and real-world benchmark problems.
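The sketch below shows a generic cost-aware selection rule over the joint space of design point and environmental (fidelity) variable, trading off a base acquisition value against a fidelity-dependent evaluation cost. It is an illustration only; the paper's acquisition and its support-point construction are not reproduced, and the callables are hypothetical.

```python
# Hedged sketch: jointly choose (design point, fidelity) by acquisition value per
# unit evaluation cost. Generic rule, not the paper's specific construction.
import numpy as np

def select_point_and_fidelity(candidates, fidelities, base_acq, cost):
    """base_acq(x, s) and cost(s) are problem-specific callables (assumptions)."""
    best, best_score = None, -np.inf
    for x in candidates:
        for s in fidelities:
            score = base_acq(x, s) / max(cost(s), 1e-12)
            if score > best_score:
                best, best_score = (x, s), score
    return best

# Toy usage with stand-in acquisition and cost functions.
choice = select_point_and_fidelity(
    candidates=np.linspace(0, 1, 11),
    fidelities=[0.25, 0.5, 1.0],
    base_acq=lambda x, s: np.exp(-(x - 0.3) ** 2) * s,   # stand-in acquisition
    cost=lambda s: s,                                     # cheaper at low fidelity
)
```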
Abstract:We propose a novel, theoretically-grounded, acquisition function for batch Bayesian optimization informed by insights from distributionally ambiguous optimization. Our acquisition function is a lower bound on the well-known Expected Improvement function, which requires evaluation of a Gaussian expectation over a multivariate piecewise affine function. Our bound is computed instead by evaluating the best-case expectation over all probability distributions consistent with the same mean and variance as the original Gaussian distribution. Unlike alternative approaches, including Expected Improvement, our proposed acquisition function avoids multi-dimensional integrations entirely, and can be computed exactly, even for large batch sizes, as the solution of a tractable convex optimization problem. Our suggested acquisition function can also be optimized efficiently, since first and second derivative information can be calculated inexpensively as by-products of the acquisition function calculation itself. We derive various novel theorems that ground our work theoretically and we demonstrate superior performance via simple motivating examples, benchmark functions and real-world problems.
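To give intuition for the moment-based bound, the sketch below evaluates the single-point case: the best-case expectation of the improvement $(y_{\text{best}} - f(x))_{+}$ over all distributions sharing a given mean and variance has a classical two-point closed form. The batch case described in the abstract is solved as a tractable convex program and is not reproduced here; the posterior values are hypothetical.

```python
# Hedged sketch: classical moment-problem bound on E[(y_best - X)_+] over all
# distributions with mean mu and variance sigma^2 (single-candidate case only).
import numpy as np

def moment_bound_improvement(mu, sigma, y_best):
    """sup over {distributions with mean mu, variance sigma^2} of E[(y_best - X)_+]."""
    gap = y_best - mu
    return 0.5 * (gap + np.sqrt(gap ** 2 + sigma ** 2))

mu = np.array([0.4, 0.1])      # GP posterior means at two candidates (hypothetical)
sigma = np.array([0.2, 0.3])   # GP posterior std devs (hypothetical)
scores = moment_bound_improvement(mu, sigma, y_best=0.3)   # incumbent (minimisation)
```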