Samuel Daulton

Thompson Sampling for Contextual Bandit Problems with Auxiliary Safety Constraints

Nov 02, 2019

Samuel Daulton, Shaun Singh, Vashist Avadhanula, Drew Dimmery, Eytan Bakshy

Abstract: Recent advances in contextual bandit optimization and reinforcement learning have garnered interest in applying these methods to real-world sequential decision-making problems. Real-world applications frequently have constraints with respect to a currently deployed policy. Many of the existing constraint-aware algorithms consider problems with a single objective (the reward) and a constraint on the reward with respect to a baseline policy. However, many important applications involve multiple competing objectives and auxiliary constraints. In this paper, we propose a novel Thompson sampling algorithm for multi-outcome contextual bandit problems with auxiliary constraints. We empirically evaluate our algorithm on a synthetic problem. Lastly, we apply our method to a real-world video transcoding problem and provide a practical way to navigate the trade-off between safety and performance using Bayesian optimization.

* To appear at NeurIPS 2019, Workshop on Safety and Robustness in Decision Making. 11 pages (including references and appendix)
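
As background for the abstract above, here is a minimal sketch of ordinary Thompson sampling on a Bernoulli bandit. It is not the paper's algorithm: it has no context, a single outcome, and no safety constraint. The function name `thompson_sampling`, the Beta(1, 1) priors, and the arm probabilities are all assumptions chosen for illustration.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pull counts per arm.

    true_probs is a hypothetical list of per-arm success probabilities
    that the learner does not see; it only observes sampled rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta posterior (uniform prior).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Play the arm whose sampled success rate is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
```

Because each arm is chosen with the posterior probability that it is the best one, exploration fades as evidence accumulates and most pulls go to the arm with the highest success rate.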

BoTorch: Programmable Bayesian Optimization in PyTorch

Oct 14, 2019

Maximilian Balandat, Brian Karrer, Daniel R. Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, Eytan Bakshy

Abstract: Bayesian optimization provides sample-efficient global optimization for a broad range of applications, including automatic machine learning, molecular chemistry, and experimental design. We introduce BoTorch, a modern programming framework for Bayesian optimization. Enabled by Monte-Carlo (MC) acquisition functions and auto-differentiation, BoTorch's modular design facilitates flexible specification and optimization of probabilistic models written in PyTorch, radically simplifying implementation of novel acquisition functions. Our MC approach is made practical by a distinctive algorithmic foundation that leverages fast predictive distributions and hardware acceleration. In experiments, we demonstrate the improved sample efficiency of BoTorch relative to other popular libraries. BoTorch is open source and available at https://github.com/pytorch/botorch.
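
To illustrate what a Monte-Carlo acquisition function means in general, the sketch below estimates expected improvement at a single candidate point by sampling from a Gaussian posterior and compares it with the closed form. It uses only the Python standard library and is not BoTorch's API; the function names `analytic_ei` and `mc_ei` and the values of `mu`, `sigma`, and `best_f` are assumptions made for this example.

```python
import math
import random


def analytic_ei(mu, sigma, best_f):
    """Closed-form expected improvement for a Gaussian posterior N(mu, sigma^2)."""
    z = (mu - best_f) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * cdf + pdf)


def mc_ei(mu, sigma, best_f, n_samples=50000, seed=0):
    """Monte-Carlo estimate of expected improvement.

    Each posterior sample is written as mu + sigma * eps with eps ~ N(0, 1),
    so the estimate is a smooth function of mu and sigma for fixed eps.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = mu + sigma * rng.gauss(0.0, 1.0)
        # Improvement over the incumbent best value, clipped at zero.
        total += max(y - best_f, 0.0)
    return total / n_samples


# Hypothetical posterior mean, standard deviation, and incumbent value.
exact = analytic_ei(0.5, 1.0, 0.3)
estimate = mc_ei(0.5, 1.0, 0.3)
```

Writing samples as a deterministic transform of fixed noise is what lets an automatic-differentiation framework compute gradients of the estimate with respect to the candidate point. The sampling form also extends to acquisition functions that have no closed form.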

Robust and Efficient Transfer Learning with Hidden-Parameter Markov Decision Processes

Oct 31, 2017

Taylor Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez

Abstract: We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent parameters and the state space. We also replace the original Gaussian Process-based model with a Bayesian Neural Network, enabling more scalable inference. Thus, we expand the scope of the HiP-MDP to applications with higher dimensions and more complex dynamics.

* To appear at NIPS 2017, selected for an oral presentation. 17 pages (including references and appendix). Example code can be found at http://github.com/dtak/hip-mdp-public
