Michael U. Gutmann

Conditional Sampling of Variational Autoencoders via Iterated Approximate Ancestral Sampling

Aug 17, 2023
Vaidotas Simkus, Michael U. Gutmann

Conditional sampling of variational autoencoders (VAEs) is needed in various applications, such as missing data imputation, but is computationally intractable. A principled choice for asymptotically exact conditional sampling is Metropolis-within-Gibbs (MWG). However, we observe that the tendency of VAEs to learn a structured latent space, a commonly desired property, can cause the MWG sampler to get "stuck" far from the target distribution. This paper mitigates the limitations of MWG: we systematically outline the pitfalls in the context of VAEs, propose two original methods that address these pitfalls, and demonstrate improved performance of the proposed methods on a set of sampling tasks.
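
To make the setting concrete, below is a minimal sketch of the Metropolis-within-Gibbs baseline for a Gaussian VAE, assuming trained `encoder` and `decoder` callables that return Gaussian means and log-variances (hypothetical names). It illustrates the standard sampler whose pitfalls the paper analyses, not the proposed methods.

```python
import torch

def gauss_logpdf(x, mean, log_var):
    # Diagonal-Gaussian log-density up to an additive constant,
    # which cancels in the Metropolis-Hastings ratio.
    return (-0.5 * (log_var + (x - mean) ** 2 / log_var.exp())).sum(-1)

@torch.no_grad()
def mwg_impute(encoder, decoder, x, mask, n_steps=1000):
    """Sample the missing entries x[~mask] given x[mask] by MWG.

    For clarity, x is a single (unbatched) data vector and mask is a
    boolean tensor that is True where x is observed.
    """
    x = x.clone()
    q_mean, q_logvar = encoder(x)
    z = q_mean + (0.5 * q_logvar).exp() * torch.randn_like(q_mean)
    for _ in range(n_steps):
        # Metropolis step on z, proposing from the amortised posterior q(z|x).
        q_mean, q_logvar = encoder(x)
        z_prop = q_mean + (0.5 * q_logvar).exp() * torch.randn_like(q_mean)

        def log_joint(zz):
            x_mean, x_logvar = decoder(zz)
            log_prior = gauss_logpdf(zz, torch.zeros_like(zz), torch.zeros_like(zz))
            return log_prior + gauss_logpdf(x, x_mean, x_logvar)

        log_alpha = (log_joint(z_prop) - log_joint(z)
                     + gauss_logpdf(z, q_mean, q_logvar)
                     - gauss_logpdf(z_prop, q_mean, q_logvar))
        if torch.rand(()).log() < log_alpha:
            z = z_prop
        # Gibbs step: resample the missing entries from p(x|z).
        x_mean, x_logvar = decoder(z)
        x_full = x_mean + (0.5 * x_logvar).exp() * torch.randn_like(x_mean)
        x[~mask] = x_full[~mask]
    return x
```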

Designing Optimal Behavioral Experiments Using Machine Learning

May 12, 2023
Simon Valentin, Steven Kleinegesse, Neil R. Bramley, Peggy Seriès, Michael U. Gutmann, Christopher G. Lucas

Computational models are powerful tools for understanding human cognition and behavior. They let us express our theories clearly and precisely, and offer predictions that can be subtle and often counter-intuitive. However, this same richness and ability to surprise means our scientific intuitions and traditional tools are ill-suited to designing experiments to test and compare these models. To avoid these pitfalls and realize the full potential of computational modeling, we require tools to design experiments that provide clear answers about which models explain human behavior and the auxiliary assumptions those models must make. Bayesian optimal experimental design (BOED) formalizes the search for optimal experimental designs by identifying experiments that are expected to yield informative data. In this work, we provide a tutorial on leveraging recent advances in BOED and machine learning to find optimal experiments for any kind of model that we can simulate data from, and show how by-products of this procedure allow for quick and straightforward evaluation of models and their parameters against real experimental data. As a case study, we consider theories of how people balance exploration and exploitation in multi-armed bandit decision-making tasks. We validate the presented approach using simulations and a real-world experiment. As compared to experimental designs commonly used in the literature, we show that our optimal designs more efficiently determine which of a set of models best accounts for individual human behavior, and more efficiently characterize behavior given a preferred model. We provide code to replicate all analyses as well as tutorial notebooks and pointers to adapt the methodology to other experimental settings.
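
As a self-contained illustration of the BOED objective, the sketch below scores candidate designs by a nested Monte Carlo estimate of the expected information gain for a toy model with a tractable likelihood (an illustrative stand-in, not the paper's bandit models); the machine-learning machinery of the tutorial replaces this brute-force estimator when the model can only be simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

def likelihood(y, theta, d):
    # Toy response model: y ~ Bernoulli(sigmoid(theta * d)), theta ~ N(0, 1).
    p = 1.0 / (1.0 + np.exp(-theta * d))
    return np.where(y == 1, p, 1.0 - p)

def eig(d, n_outer=2000, n_inner=2000):
    # EIG(d) = E_{p(theta) p(y|theta,d)}[log p(y|theta,d) - log p(y|d)].
    theta = rng.standard_normal(n_outer)                   # theta ~ p(theta)
    y = rng.random(n_outer) < 1.0 / (1.0 + np.exp(-theta * d))
    log_lik = np.log(likelihood(y, theta, d))
    theta_inner = rng.standard_normal(n_inner)             # fresh prior draws
    lik = likelihood(y[:, None], theta_inner[None, :], d)  # (outer, inner)
    log_marg = np.log(lik.mean(axis=1))                    # estimate of log p(y|d)
    return np.mean(log_lik - log_marg)

designs = np.linspace(0.1, 5.0, 25)
scores = [eig(d) for d in designs]
print("best design:", designs[int(np.argmax(scores))])
```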

* Under review 

Estimating the Density Ratio between Distributions with High Discrepancy using Multinomial Logistic Regression

May 01, 2023
Akash Srivastava, Seungwook Han, Kai Xu, Benjamin Rhodes, Michael U. Gutmann

Functions of the ratio of the densities $p/q$ are widely used in machine learning to quantify the discrepancy between the two distributions $p$ and $q$. For high-dimensional distributions, binary classification-based density ratio estimators have shown great promise. However, when densities are well separated, estimating the density ratio with a binary classifier is challenging. In this work, we show that state-of-the-art density ratio estimators perform poorly on well-separated cases and demonstrate that this is due to distribution shifts between training and evaluation time. We present an alternative method that leverages multi-class classification for density ratio estimation and does not suffer from distribution shift issues. The method uses a set of auxiliary densities $\{m_k\}_{k=1}^K$ and trains a multi-class logistic regression to classify the samples from $p, q$, and $\{m_k\}_{k=1}^K$ into $K+2$ classes. We show that if these auxiliary densities are constructed such that they overlap with $p$ and $q$, then a multi-class logistic regression allows for estimating $\log p/q$ on the domain of any of the $K+2$ distributions and resolves the distribution shift problems of the current state-of-the-art methods. We compare our method to state-of-the-art density ratio estimators on both synthetic and real datasets and demonstrate its superior performance on the tasks of density ratio estimation, mutual information estimation, and representation learning. Code: https://www.blackswhan.com/mdre/
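
A minimal sketch of the multi-class idea on toy 1-D Gaussians (the densities and single auxiliary density are illustrative): with equal class counts, the optimal softmax classifier's logits equal the class log-densities up to a shared constant, so $\log p/q$ can be read off as the difference of the $p$- and $q$-logits.

```python
import torch

torch.manual_seed(0)
n = 5000
xp = torch.randn(n, 1) - 2.0        # p = N(-2, 1)
xq = torch.randn(n, 1) + 2.0        # q = N(+2, 1): well separated from p
xm = 2.5 * torch.randn(n, 1)        # auxiliary m = N(0, 2.5^2), overlaps both

x = torch.cat([xp, xq, xm])
y = torch.cat([torch.full((n,), k, dtype=torch.long) for k in range(3)])

clf = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 3))
opt = torch.optim.Adam(clf.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(clf(x), y)
    loss.backward()
    opt.step()

# With equal class counts, logit differences estimate log-density ratios.
with torch.no_grad():
    logits = clf(torch.tensor([[0.0]]))
    print(float(logits[0, 0] - logits[0, 1]))  # ≈ log p(0)/q(0) = 0 by symmetry
```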

* TMLR 2023  

Bayesian Optimization with Informative Covariance

Aug 04, 2022
Afonso Eduardo, Michael U. Gutmann

Bayesian optimization is a methodology for global optimization of unknown and expensive objectives. It combines a surrogate Bayesian regression model with an acquisition function to decide where to evaluate the objective. Typical regression models are Gaussian processes with stationary covariance functions, which, however, are unable to express prior input-dependent information, in particular information about possible locations of the optimum. The ubiquity of stationary models has led to the common practice of exploiting prior information via informative mean functions. In this paper, we highlight that these models can lead to poor performance, especially in high dimensions. We propose novel informative covariance functions that leverage nonstationarity to encode preferences for certain regions of the search space and adaptively promote local exploration during the optimization. We demonstrate that they can increase the sample efficiency of the optimization in high dimensions, even under weak prior information.
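
One simple way to encode such a preference (a hedged sketch; the paper's parametrisations differ in detail) is to modulate a stationary kernel by an input-dependent scale that peaks at a believed optimum location x0, raising the prior signal variance there while keeping the covariance positive semi-definite.

```python
import numpy as np

def informative_kernel(X, X2, x0, ell=1.0, base_var=1.0, boost=2.0, width=1.0):
    """k(x, x') = s(x) s(x') k_SE(x, x'), with s(x) peaked at x0.

    Scaling a PSD kernel by s(x) s(x') preserves positive semi-definiteness.
    """
    def s(X):
        d2 = ((X - x0) ** 2).sum(axis=1)
        return np.sqrt(base_var + boost * np.exp(-0.5 * d2 / width**2))
    sq = ((X[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    k_se = np.exp(-0.5 * sq / ell**2)
    return s(X)[:, None] * s(X2)[None, :] * k_se

X = np.random.default_rng(0).uniform(-3, 3, size=(5, 2))
K = informative_kernel(X, X, x0=np.zeros(2))
print(np.linalg.eigvalsh(K).min() >= -1e-9)  # still a valid covariance matrix
```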

Pen and Paper Exercises in Machine Learning

Jun 27, 2022
Michael U. Gutmann

This is a collection of (mostly) pen-and-paper exercises in machine learning. The exercises are on the following topics: linear algebra, optimisation, directed graphical models, undirected graphical models, expressive power of graphical models, factor graphs and message passing, inference for hidden Markov models, model-based learning (including ICA and unnormalised models), sampling and Monte Carlo integration, and variational inference.

* The associated github page is https://github.com/michaelgutmann/ml-pen-and-paper-exercises 

Statistical applications of contrastive learning

Apr 29, 2022
Michael U. Gutmann, Steven Kleinegesse, Benjamin Rhodes

The likelihood function plays a crucial role in statistical inference and experimental design. However, it is computationally intractable for several important classes of statistical models, including energy-based models and simulator-based models. Contrastive learning is an intuitive and computationally feasible alternative to likelihood-based learning. Here we first provide an introduction to contrastive learning and then show how we can use it to derive methods for diverse statistical problems, namely parameter estimation for energy-based models, Bayesian inference for simulator-based models, as well as experimental design.
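
As a concrete instance of the contrastive idea, the sketch below fits an unnormalised 1-D Gaussian energy-based model by noise-contrastive estimation: logistic regression between data and noise samples, with the log-partition function treated as a free parameter c. The toy densities are illustrative.

```python
import math
import torch

torch.manual_seed(0)
x_data = 0.5 * torch.randn(4000, 1) + 1.0        # data from N(1, 0.5^2)
x_noise = 3.0 * torch.randn(4000, 1)             # noise nu = N(0, 3^2)

def log_noise(x):
    return -0.5 * (x / 3.0) ** 2 - math.log(3.0 * math.sqrt(2 * math.pi))

# Unnormalised model: log phi(x) = -(x - mu)^2 / (2 sigma^2) + c,
# with the log-partition term c estimated alongside mu and sigma.
mu = torch.zeros(1, requires_grad=True)
log_sigma = torch.zeros(1, requires_grad=True)
c = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma, c], lr=0.05)

def log_model(x):
    return -0.5 * ((x - mu) / log_sigma.exp()) ** 2 + c

for _ in range(500):
    opt.zero_grad()
    # NCE: classify data (label 1) vs noise (label 0) using the
    # log-odds G(x) = log phi(x) - log nu(x).
    g_data = log_model(x_data) - log_noise(x_data)
    g_noise = log_model(x_noise) - log_noise(x_noise)
    loss = -(torch.nn.functional.logsigmoid(g_data).mean()
             + torch.nn.functional.logsigmoid(-g_noise).mean())
    loss.backward()
    opt.step()

print(float(mu), float(log_sigma.exp()))         # approach 1.0 and 0.5
```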

* Accepted to Behaviormetrika 

Variational Gibbs inference for statistical model estimation from incomplete data

Nov 25, 2021
Vaidotas Simkus, Benjamin Rhodes, Michael U. Gutmann

Statistical models are central to machine learning with broad applicability across a range of downstream tasks. The models are typically controlled by free parameters that are estimated from data by maximum-likelihood estimation. However, when faced with real-world datasets, many of these models run into a critical issue: they are formulated in terms of fully-observed data, whereas in practice the datasets are plagued with missing data. The theory of statistical model estimation from incomplete data is conceptually similar to the estimation of latent-variable models, where powerful tools such as variational inference (VI) exist. However, in contrast to standard latent-variable models, parameter estimation with incomplete data often requires estimating exponentially-many conditional distributions of the missing variables, hence making standard VI methods intractable. We address this gap by introducing variational Gibbs inference (VGI), a new general-purpose method to estimate the parameters of statistical models from incomplete data. We validate VGI on a set of synthetic and real-world estimation tasks, estimating important machine learning models, such as VAEs and normalising flows, from incomplete data. The proposed method, whilst general-purpose, achieves competitive or better performance than existing model-specific estimation methods.
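
For orientation, the standard identities below (in the abstract's notation; not the paper's derivation) show where the intractability comes from: each missingness pattern induces a different conditional of the missing variables, so with D variables a naive amortised VI scheme would need up to 2^D separate variational posteriors.

```latex
% Marginal likelihood of one incompletely observed point: the missing
% coordinates x_mis are integrated out.
\log p_\theta(x_{\mathrm{obs}})
  = \log \int p_\theta(x_{\mathrm{obs}}, x_{\mathrm{mis}})\, \mathrm{d}x_{\mathrm{mis}}
% The corresponding evidence lower bound requires a variational
% approximation q to p_theta(x_mis | x_obs); a different conditional is
% needed for each of the up-to-2^D missingness patterns.
\log p_\theta(x_{\mathrm{obs}})
  \ge \mathbb{E}_{q(x_{\mathrm{mis}} \mid x_{\mathrm{obs}})}
      \!\left[\log \frac{p_\theta(x_{\mathrm{obs}}, x_{\mathrm{mis}})}
                        {q(x_{\mathrm{mis}} \mid x_{\mathrm{obs}})}\right]
```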

Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods

Nov 03, 2021
Desi R. Ivanova, Adam Foster, Steven Kleinegesse, Michael U. Gutmann, Tom Rainforth

We introduce implicit Deep Adaptive Design (iDAD), a new method for performing adaptive experiments in real-time with implicit models. iDAD amortizes the cost of Bayesian optimal experimental design (BOED) by learning a design policy network upfront, which can then be deployed quickly at the time of the experiment. The iDAD network can be trained on any model which simulates differentiable samples, unlike previous design policy work that requires a closed form likelihood and conditionally independent experiments. At deployment, iDAD allows design decisions to be made in milliseconds, in contrast to traditional BOED approaches that require heavy computation during the experiment itself. We illustrate the applicability of iDAD on a number of experiments, and show that it provides a fast and effective mechanism for performing adaptive design with implicit models.
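
The core architectural idea can be sketched in a few lines: a permutation-invariant network maps the history of (design, outcome) pairs to the next design, so that at deployment choosing a design is a single forward pass. The architecture and names below are illustrative, not the paper's exact model, and the training loop (maximising a mutual information lower bound through the differentiable simulator) is omitted.

```python
import torch

class DesignPolicy(torch.nn.Module):
    def __init__(self, design_dim=1, outcome_dim=1, hidden=64):
        super().__init__()
        self.hidden = hidden
        # Permutation-invariant encoder of past (design, outcome) pairs.
        self.encode = torch.nn.Sequential(
            torch.nn.Linear(design_dim + outcome_dim, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden))
        self.head = torch.nn.Sequential(
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, design_dim))

    def forward(self, designs, outcomes):
        # designs, outcomes: (batch, t, dim); t = 0 before the first experiment.
        if designs.shape[1] == 0:
            pooled = torch.zeros(designs.shape[0], self.hidden)
        else:
            pooled = self.encode(torch.cat([designs, outcomes], dim=-1)).sum(dim=1)
        return self.head(pooled)  # next design: one forward pass at deployment

policy = DesignPolicy()
d1 = policy(torch.empty(8, 0, 1), torch.empty(8, 0, 1))  # first design, no history
y1 = torch.randn(8, 1, 1)                                # (simulated) outcomes
d2 = policy(d1.unsqueeze(1), y1)                         # design adapted to history
```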

* 33 pages, 8 figures. Published as a conference paper at NeurIPS 2021 

Bayesian Optimal Experimental Design for Simulator Models of Cognition

Oct 29, 2021
Simon Valentin, Steven Kleinegesse, Neil R. Bramley, Michael U. Gutmann, Christopher G. Lucas

Bayesian optimal experimental design (BOED) is a methodology to identify experiments that are expected to yield informative data. Recent work in cognitive science considered BOED for computational models of human behavior with tractable and known likelihood functions. However, tractability often comes at the cost of realism; simulator models that can capture the richness of human behavior are often intractable. In this work, we combine recent advances in BOED and approximate inference for intractable models, using machine-learning methods to find optimal experimental designs, approximate sufficient summary statistics and amortized posterior distributions. Our simulation experiments on multi-armed bandit tasks show that our method results in improved model discrimination and parameter estimation, as compared to experimental designs commonly used in the literature.

* Accepted as a poster at the NeurIPS 2021 Workshop "AI for Science" 

Gradient-based Bayesian Experimental Design for Implicit Models using Mutual Information Lower Bounds

May 10, 2021
Steven Kleinegesse, Michael U. Gutmann

We introduce a framework for Bayesian experimental design (BED) with implicit models, where the data-generating distribution is intractable but sampling from it is still possible. In order to find optimal experimental designs for such models, our approach maximises mutual information lower bounds that are parametrised by neural networks. By training a neural network on sampled data, we simultaneously update network parameters and designs using stochastic gradient ascent. The framework enables experimental design with a variety of prominent lower bounds and can be applied to a wide range of scientific tasks, such as parameter estimation, model discrimination and improving future predictions. Using a set of intractable toy models, we provide a comprehensive empirical comparison of prominent lower bounds applied to the aforementioned tasks. We further validate our framework on a challenging system of stochastic differential equations from epidemiology.
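
A toy sketch of the core loop, assuming a differentiable linear-Gaussian simulator (illustrative, not from the paper): the NWJ lower bound on the mutual information I(theta; y | d) is maximised jointly over the critic parameters and the design d by stochastic gradient ascent.

```python
import torch

torch.manual_seed(0)
d = torch.tensor([0.5], requires_grad=True)        # the design being optimised
critic = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(
    [{"params": critic.parameters()}, {"params": [d], "lr": 0.05}], lr=1e-3)

def simulate(theta, d):
    # Stand-in simulator: differentiable in d, as the framework requires.
    return theta * d + 0.1 * torch.randn_like(theta)

for _ in range(2000):
    opt.zero_grad()
    theta = torch.randn(256, 1)                    # theta ~ p(theta)
    y = simulate(theta, d)
    joint = critic(torch.cat([theta, y], dim=-1))  # samples from the joint
    shuffled = theta[torch.randperm(256)]
    marg = critic(torch.cat([shuffled, y], dim=-1))  # product of marginals
    # NWJ bound: I(theta; y | d) >= E_joint[f] - e^{-1} E_marg[e^f].
    mi_lb = joint.mean() - (marg - 1.0).exp().mean()
    (-mi_lb).backward()                            # ascend the bound in f and d
    opt.step()

print("optimised design:", float(d))  # grows in magnitude here; a real problem
                                      # would constrain d to its design space
```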

* Under review 