Maximilian Balandat

Bayesian Optimization over High-Dimensional Combinatorial Spaces via Dictionary-based Embeddings

Mar 03, 2023

Aryan Deshwal, Sebastian Ament, Maximilian Balandat, Eytan Bakshy, Janardhan Rao Doppa, David Eriksson

Abstract: We consider the problem of optimizing expensive black-box functions over high-dimensional combinatorial spaces, which arises in many science, engineering, and ML applications. We use Bayesian Optimization (BO) and propose a novel surrogate modeling approach for efficiently handling a large number of binary and categorical parameters. The key idea is to select a number of discrete structures from the input space (the dictionary) and use them to define an ordinal embedding for high-dimensional combinatorial structures. This allows us to use existing Gaussian process models for continuous spaces. We develop a principled approach based on binary wavelets to construct dictionaries for binary spaces, and propose a randomized construction method that generalizes to categorical spaces. We provide theoretical justification to support the effectiveness of the dictionary-based embeddings. Our experiments on diverse real-world benchmarks demonstrate the effectiveness of our proposed surrogate modeling approach over state-of-the-art BO methods.

* Appearing in AISTATS 2023
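
To make the dictionary idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the embedding of a binary vector is its vector of Hamming distances to the dictionary elements (the abstract does not name the distance), and it samples the dictionary uniformly at random purely for illustration rather than using the wavelet construction. The names `random_dictionary` and `hamming_embedding` are hypothetical.

```python
import random

def random_dictionary(num_elements, dim, seed=0):
    """Sample `num_elements` binary vectors of length `dim` (illustrative only)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(dim)] for _ in range(num_elements)]

def hamming_embedding(x, dictionary):
    """Map a binary vector x to its Hamming distance from each dictionary element.

    Assumption: Hamming distance is used as the per-element feature.
    """
    return [sum(xi != ai for xi, ai in zip(x, a)) for a in dictionary]

dictionary = random_dictionary(num_elements=4, dim=8)
x = [1, 0, 1, 1, 0, 0, 1, 0]
z = hamming_embedding(x, dictionary)  # 4 ordinal features, each in {0, ..., 8}
print(z)
```

Under this assumption, an 8-dimensional binary input becomes a 4-dimensional vector of ordinal features, which a standard Gaussian process kernel for continuous inputs could consume.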

Bayesian Optimization over Discrete and Mixed Spaces via Probabilistic Reparameterization

Oct 18, 2022

Samuel Daulton, Xingchen Wan, David Eriksson, Maximilian Balandat, Michael A. Osborne, Eytan Bakshy

Abstract: Optimizing expensive-to-evaluate black-box functions of discrete (and potentially continuous) design parameters is a ubiquitous problem in scientific and engineering applications. Bayesian optimization (BO) is a popular, sample-efficient method that leverages a probabilistic surrogate model and an acquisition function (AF) to select promising designs to evaluate. However, maximizing the AF over mixed or high-cardinality discrete search spaces is challenging: standard gradient-based methods cannot be used directly, and evaluating the AF at every point in the search space would be computationally prohibitive. To address this issue, we propose using probabilistic reparameterization (PR). Instead of directly optimizing the AF over the search space containing discrete parameters, we maximize the expectation of the AF over a probability distribution defined by continuous parameters. We prove that under suitable reparameterizations, the BO policy that maximizes the probabilistic objective is the same as that which maximizes the AF, and therefore, PR enjoys the same regret bounds as the original BO policy using the underlying AF. Moreover, our approach provably converges to a stationary point of the probabilistic objective under gradient ascent using scalable, unbiased estimators of both the probabilistic objective and its gradient. Therefore, as the number of starting points and gradient steps increases, our approach will recover a maximizer of the AF (an often-neglected requisite for commonly used BO regret bounds). We validate our approach empirically and demonstrate state-of-the-art optimization performance on a wide range of real-world applications. PR is complementary to (and benefits) recent work and naturally generalizes to settings with multiple objectives and black-box constraints.

* To appear in Advances in Neural Information Processing Systems 35, 2022. Code available at: https://github.com/facebookresearch/bo_pr
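
The probabilistic objective can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not the code in the linked repository: `acquisition` is a made-up stand-in for an acquisition function over three binary parameters, and the expectation under independent Bernoulli distributions is computed by exact enumeration, whereas the abstract describes unbiased estimators suited to large spaces.

```python
from itertools import product

def acquisition(z):
    """Toy stand-in for an acquisition function over binary designs (hypothetical)."""
    return 1.0 + z[0] - 0.5 * z[1] + 2.0 * z[0] * z[2]

def expected_acquisition(theta):
    """Exact E_{z ~ Bernoulli(theta)}[acquisition(z)], enumerating every z."""
    total = 0.0
    for z in product((0, 1), repeat=len(theta)):
        prob = 1.0
        for zi, ti in zip(z, theta):
            prob *= ti if zi == 1 else 1.0 - ti
        total += prob * acquisition(z)
    return total

# Brute-force maximizer of the discrete problem, for comparison.
best_z = max(product((0, 1), repeat=3), key=acquisition)

# At a vertex of [0, 1]^3 the distribution is a point mass, so the
# probabilistic objective equals the acquisition value at that design.
print(best_z, expected_acquisition((1.0, 0.0, 1.0)))
print(expected_acquisition((0.5, 0.5, 0.5)))
```

Because the expectation is a weighted average of acquisition values, it can never exceed the best discrete value, and it attains that value when `theta` puts all mass on a maximizer. This is the intuition behind the equivalence claimed in the abstract.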

Robust Multi-Objective Bayesian Optimization Under Input Noise

Feb 16, 2022

Samuel Daulton, Sait Cakmak, Maximilian Balandat, Michael A. Osborne, Enlu Zhou, Eytan Bakshy

Abstract: Bayesian optimization (BO) is a sample-efficient approach for tuning design parameters to optimize expensive-to-evaluate, black-box performance metrics. In many manufacturing processes, the design parameters are subject to random input noise, resulting in a product that is often less performant than expected. Although BO methods have been proposed for optimizing a single objective under input noise, no existing method addresses the practical scenario where there are multiple objectives that are sensitive to input perturbations. In this work, we propose the first multi-objective BO method that is robust to input noise. We formalize our goal as optimizing the multivariate value-at-risk (MVaR), a risk measure of the uncertain objectives. Since directly optimizing MVaR is computationally infeasible in many settings, we propose a scalable, theoretically-grounded approach for optimizing MVaR using random scalarizations. Empirically, we find that our approach significantly outperforms alternative methods and efficiently identifies optimal robust designs that will satisfy specifications across multiple metrics with high probability.

* 41 pages. Code is available at https://github.com/facebookresearch/robust_mobo

Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs

Nov 12, 2021

Raul Astudillo, Daniel R. Jiang, Maximilian Balandat, Eytan Bakshy, Peter I. Frazier

Abstract: Bayesian optimization (BO) is a sample-efficient approach to optimizing costly-to-evaluate black-box functions. Most BO methods ignore how evaluation costs may vary over the optimization domain. However, these costs can be highly heterogeneous and are often unknown in advance. This occurs in many practical settings, such as hyperparameter tuning of machine learning algorithms or physics-based simulation optimization. Moreover, those few existing methods that acknowledge cost heterogeneity do not naturally accommodate a budget constraint on the total evaluation cost. This combination of unknown costs and a budget constraint introduces a new dimension to the exploration-exploitation trade-off, where learning about the cost incurs the cost itself. Existing methods do not reason about the various trade-offs of this problem in a principled way, often leading to poor performance. We formalize this claim by proving that the expected improvement and the expected improvement per unit of cost, arguably the two most widely used acquisition functions in practice, can be arbitrarily inferior with respect to the optimal non-myopic policy. To overcome the shortcomings of existing approaches, we propose the budgeted multi-step expected improvement, a non-myopic acquisition function that generalizes classical expected improvement to the setting of heterogeneous and unknown evaluation costs. Finally, we show that our acquisition function outperforms existing methods in a variety of synthetic and real problems.

* In Advances in Neural Information Processing Systems, 2021
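
For readers unfamiliar with the two baselines named in the abstract, the following sketch shows classical expected improvement (for maximization, under a Gaussian posterior) and expected improvement per unit of cost. It does not implement the paper's budgeted multi-step acquisition function, and it treats the cost as a known positive number, which is a simplifying assumption since the abstract considers unknown costs. The function names are hypothetical.

```python
import math

def expected_improvement(mean, std, best):
    """Closed-form EI for maximization given posterior mean/std and incumbent value."""
    if std <= 0.0:
        # Degenerate posterior: improvement is deterministic.
        return max(mean - best, 0.0)
    u = (mean - best) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def ei_per_unit_cost(mean, std, best, cost):
    """EI divided by an evaluation cost (assumed known and positive here)."""
    return expected_improvement(mean, std, best) / cost

# Two candidates with the same posterior but different costs.
cheap = ei_per_unit_cost(mean=0.0, std=1.0, best=0.0, cost=1.0)
pricey = ei_per_unit_cost(mean=0.0, std=1.0, best=0.0, cost=4.0)
print(cheap, pricey)
```

Both quantities look only one step ahead, which is the myopia the abstract argues can be arbitrarily costly under a budget.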

Sustainable AI: Environmental Implications, Challenges and Opportunities

Oct 30, 2021

Carole-Jean Wu, Ramya Raghavendra, Udit Gupta, Bilge Acun, Newsha Ardalani, Kiwan Maeng, Gloria Chang, Fiona Aga Behram, James Huang, Charles Bai (+15 more)

Abstract: This paper explores the environmental impact of the super-linear growth trends for AI from a holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the carbon footprint of AI computing by examining the model development cycle across industry-scale machine learning use cases and, at the same time, considering the life cycle of system hardware. Taking a step further, we capture the operational and manufacturing carbon footprint of AI computing and present an end-to-end analysis for what and how hardware-software design and at-scale optimization can help reduce the overall carbon footprint of AI. Based on the industry experience and lessons learned, we share the key challenges and chart out important development directions across the many dimensions of AI. We hope the key messages and insights presented in this paper can inspire the community to advance the field of AI in an environmentally-responsible manner.

Multi-Objective Bayesian Optimization over High-Dimensional Search Spaces

Sep 22, 2021

Samuel Daulton, David Eriksson, Maximilian Balandat, Eytan Bakshy

Abstract: The ability to optimize multiple competing objective functions with high sample efficiency is imperative in many applied problems across science and industry. Multi-objective Bayesian optimization (BO) achieves strong empirical performance on such problems, but even with recent methodological advances, it has been restricted to simple, low-dimensional domains. Most existing BO methods exhibit poor performance on search spaces with more than a few dozen parameters. In this work we propose MORBO, a method for multi-objective Bayesian optimization over high-dimensional search spaces. MORBO performs local Bayesian optimization within multiple trust regions simultaneously, allowing it to explore and identify diverse solutions even when the objective functions are difficult to model globally. We show that MORBO significantly advances the state-of-the-art in sample-efficiency for several high-dimensional synthetic and real-world multi-objective problems, including a vehicle design problem with 222 parameters, demonstrating that MORBO is a practical approach for challenging and important problems that were previously out of reach for BO methods.

Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization

Jun 25, 2021

David Eriksson, Pierce I-Jen Chuang, Samuel Daulton, Peng Xia, Akshat Shrivastava, Arun Babu, Shicong Zhao, Ahmed Aly, Ganesh Venkatesh, Maximilian Balandat

Abstract: When tuning the architecture and hyperparameters of large machine learning models for on-device deployment, it is desirable to understand the optimal trade-offs between on-device latency and model accuracy. In this work, we leverage recent methodological advances in Bayesian optimization over high-dimensional search spaces and multi-objective Bayesian optimization to efficiently explore these trade-offs for a production-scale on-device natural language understanding model at Facebook.

* To Appear at the 8th ICML Workshop on Automated Machine Learning, ICML 2021

Bayesian Optimization with High-Dimensional Outputs

Jun 24, 2021

Wesley J. Maddox, Maximilian Balandat, Andrew Gordon Wilson, Eytan Bakshy

Abstract: Bayesian Optimization is a sample-efficient black-box optimization procedure that is typically applied to problems with a small number of independent objectives. However, in practice we often wish to optimize objectives defined over many correlated outcomes (or "tasks"). For example, scientists may want to optimize the coverage of a cell tower network across a dense grid of locations. Similarly, engineers may seek to balance the performance of a robot across dozens of different environments via constrained or robust optimization. However, the Gaussian Process (GP) models typically used as probabilistic surrogates for multi-task Bayesian Optimization scale poorly with the number of outcomes, greatly limiting applicability. We devise an efficient technique for exact multi-task GP sampling that combines exploiting Kronecker structure in the covariance matrices with Matheron's identity, allowing us to perform Bayesian Optimization using exact multi-task GP models with tens of thousands of correlated outputs. In doing so, we achieve substantial improvements in sample efficiency compared to existing approaches that only model aggregate functions of the outcomes. We demonstrate how this unlocks a new class of applications for Bayesian Optimization across a range of tasks in science and engineering, including optimizing interference patterns of an optical interferometer with more than 65,000 outputs.
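
For reference, Matheron's identity (often called Matheron's rule) is commonly stated as below. The notation is an assumption introduced here and is not taken from the paper: $f$ is a draw from the GP prior, $X$ the training inputs, $y$ the noisy observations with noise variance $\sigma^2$, and $K$ the prior covariance. The second line is the standard Kronecker inverse identity, which is one reason Kronecker-structured covariances are cheap to work with.

```latex
\begin{aligned}
(f \mid y)(\cdot) &\overset{d}{=} f(\cdot) + K_{\cdot X}\,\bigl(K_{XX} + \sigma^2 I\bigr)^{-1}\bigl(y - f(X) - \varepsilon\bigr),
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I), \\
(A \otimes B)^{-1} &= A^{-1} \otimes B^{-1}.
\end{aligned}
```

The first line says a posterior sample can be obtained by drawing a prior sample and applying a data-dependent correction, rather than by factorizing the posterior covariance directly.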

Parallel Bayesian Optimization of Multiple Noisy Objectives with Expected Hypervolume Improvement

May 17, 2021

Samuel Daulton, Maximilian Balandat, Eytan Bakshy

Abstract: Optimizing multiple competing black-box objectives is a challenging problem in many fields, including science, engineering, and machine learning. Multi-objective Bayesian optimization is a powerful approach for identifying the optimal trade-offs between the objectives with very few function evaluations. However, existing methods tend to perform poorly when observations are corrupted by noise, as they do not take into account uncertainty in the true Pareto frontier over the previously evaluated designs. We propose a novel acquisition function, NEHVI, that overcomes this important practical limitation by applying a Bayesian treatment to the popular expected hypervolume improvement criterion to integrate over this uncertainty in the Pareto frontier. We further argue that, even in the noiseless setting, the problem of generating multiple candidates in parallel reduces to that of handling uncertainty in the Pareto frontier. Through this lens, we derive a natural parallel variant of NEHVI that can efficiently generate large batches of candidates. We provide a theoretical convergence guarantee for optimizing a Monte Carlo estimator of NEHVI using exact sample-path gradients. Empirically, we show that NEHVI achieves state-of-the-art performance in noisy and large-batch environments.
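
The hypervolume improvement that underlies the criterion can be sketched for two objectives. This is a deterministic illustration under stated assumptions (both objectives are maximized, outcomes are known exactly, and the reference point is chosen by hand); it is not NEHVI, which per the abstract integrates over posterior uncertainty in the Pareto frontier. The function names are hypothetical.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (maximization, 2 objectives)."""
    # Only points that strictly improve on the reference point contribute.
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective downward, adding new horizontal strips.
    pts.sort(key=lambda p: p[0], reverse=True)
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area

def hypervolume_improvement(candidate, points, ref):
    """Increase in dominated area from adding `candidate` to `points`."""
    return hypervolume_2d(points + [candidate], ref) - hypervolume_2d(points, ref)

front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
ref = (0.0, 0.0)
print(hypervolume_2d(front, ref))
print(hypervolume_improvement((2.5, 2.5), front, ref))
print(hypervolume_improvement((1.0, 1.0), front, ref))  # dominated candidate
```

A dominated candidate adds no area, so its improvement is zero; an expected version of this quantity averages the improvement over the model's predictive distribution.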

Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees

Jun 29, 2020

Shali Jiang, Daniel R. Jiang, Maximilian Balandat, Brian Karrer, Jacob R. Gardner, Roman Garnett

Abstract: Bayesian optimization is a sequential decision making framework for optimizing expensive-to-evaluate black-box functions. Computing a full lookahead policy amounts to solving a highly intractable stochastic dynamic program. Myopic approaches, such as expected improvement, are often adopted in practice, but they ignore the long-term impact of the immediate decision. Existing nonmyopic approaches are mostly heuristic and/or computationally expensive. In this paper, we provide the first efficient implementation of general multi-step lookahead Bayesian optimization, formulated as a sequence of nested optimization problems within a multi-step scenario tree. Instead of solving these problems in a nested way, we equivalently optimize all decision variables in the full tree jointly, in a "one-shot" fashion. Combining this with an efficient method for implementing multi-step Gaussian process "fantasization," we demonstrate that multi-step expected improvement is computationally tractable and exhibits performance superior to existing methods on a wide range of benchmarks.
