Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Dino Sejdinovic

Bayesian Adaptive Calibration and Optimal Design

May 23, 2024

Rafael Oliveira, Dino Sejdinovic, David Howard, Edwin Bonilla

Figure 1 for Bayesian Adaptive Calibration and Optimal Design

Figure 2 for Bayesian Adaptive Calibration and Optimal Design

Figure 3 for Bayesian Adaptive Calibration and Optimal Design

Figure 4 for Bayesian Adaptive Calibration and Optimal Design

Abstract:The process of calibrating computer models of natural phenomena is essential for applications in the physical sciences, where plenty of domain knowledge can be embedded into simulations and then calibrated against real observations. Current machine learning approaches, however, mostly rely on rerunning simulations over a fixed set of designs available in the observed data, potentially neglecting informative correlations across the design space and requiring a large amount of simulations. Instead, we consider the calibration process from the perspective of Bayesian adaptive experimental design and propose a data-efficient algorithm to run maximally informative simulations within a batch-sequential process. At each round, the algorithm jointly estimates the parameters of the posterior distribution and optimal designs by maximising a variational lower bound of the expected information gain. The simulator is modelled as a sample from a Gaussian process, which allows us to correlate simulations and observed data with the unknown calibration parameters. We show the benefits of our method when compared to related approaches across synthetic and real-data problems.

* Preprint, currently under review

Via

Access Paper or Ask Questions

Neural-Kernel Conditional Mean Embeddings

Mar 16, 2024

Eiki Shimizu, Kenji Fukumizu, Dino Sejdinovic

Abstract:Kernel conditional mean embeddings (CMEs) offer a powerful framework for representing conditional distribution, but they often face scalability and expressiveness challenges. In this work, we propose a new method that effectively combines the strengths of deep learning with CMEs in order to address these challenges. Specifically, our approach leverages the end-to-end neural network (NN) optimization framework using a kernel-based objective. This design circumvents the computationally expensive Gram matrix inversion required by current CME methods. To further enhance performance, we provide efficient strategies to optimize the remaining kernel hyperparameters. In conditional density estimation tasks, our NN-CME hybrid achieves competitive performance and often surpasses existing deep learning-based methods. Lastly, we showcase its remarkable versatility by seamlessly integrating it into reinforcement learning (RL) contexts. Building on Q-learning, our approach naturally leads to a new variant of distributional RL methods, which demonstrates consistent effectiveness across different environments.

Via

Access Paper or Ask Questions

Exact, Fast and Expressive Poisson Point Processes via Squared Neural Families

Feb 14, 2024

Russell Tsuchida, Cheng Soon Ong, Dino Sejdinovic

Abstract:We introduce squared neural Poisson point processes (SNEPPPs) by parameterising the intensity function by the squared norm of a two layer neural network. When the hidden layer is fixed and the second layer has a single neuron, our approach resembles previous uses of squared Gaussian process or kernel methods, but allowing the hidden layer to be learnt allows for additional flexibility. In many cases of interest, the integrated intensity function admits a closed form and can be computed in quadratic time in the number of hidden neurons. We enumerate a far more extensive number of such cases than has previously been discussed. Our approach is more memory and time efficient than naive implementations of squared or exponentiated kernel methods or Gaussian processes. Maximum likelihood and maximum a posteriori estimates in a reparameterisation of the final layer of the intensity function can be obtained by solving a (strongly) convex optimisation problem using projected gradient descent. We demonstrate SNEPPPs on real, and synthetic benchmarks, and provide a software implementation. https://github.com/RussellTsuchida/snefy

* AAAI 2024 camera ready submission

Via

Access Paper or Ask Questions

FaIRGP: A Bayesian Energy Balance Model for Surface Temperatures Emulation

Jul 14, 2023

Shahine Bouabid, Dino Sejdinovic, Duncan Watson-Parris

Abstract:Emulators, or reduced complexity climate models, are surrogate Earth system models that produce projections of key climate quantities with minimal computational resources. Using time-series modeling or more advanced machine learning techniques, data-driven emulators have emerged as a promising avenue of research, producing spatially resolved climate responses that are visually indistinguishable from state-of-the-art Earth system models. Yet, their lack of physical interpretability limits their wider adoption. In this work, we introduce FaIRGP, a data-driven emulator that satisfies the physical temperature response equations of an energy balance model. The result is an emulator that (i) enjoys the flexibility of statistical machine learning models and can learn from observations, and (ii) has a robust physical grounding with interpretable parameters that can be used to make inference about the climate system. Further, our Bayesian approach allows a principled and mathematically tractable uncertainty quantification. Our model demonstrates skillful emulation of global mean surface temperature and spatial surface temperatures across realistic future scenarios. Its ability to learn from data allows it to outperform energy balance models, while its robust physical foundation safeguards against the pitfalls of purely data-driven models. We also illustrate how FaIRGP can be used to obtain estimates of top-of-atmosphere radiative forcing and discuss the benefits of its mathematical tractability for applications such as detection and attribution or precipitation emulation. We hope that this work will contribute to widening the adoption of data-driven methods in climate emulation.

Via

Access Paper or Ask Questions

Explaining the Uncertain: Stochastic Shapley Values for Gaussian Process Models

May 24, 2023

Siu Lun Chau, Krikamol Muandet, Dino Sejdinovic

Abstract:We present a novel approach for explaining Gaussian processes (GPs) that can utilize the full analytical covariance structure present in GPs. Our method is based on the popular solution concept of Shapley values extended to stochastic cooperative games, resulting in explanations that are random variables. The GP explanations generated using our approach satisfy similar favorable axioms to standard Shapley values and possess a tractable covariance function across features and data observations. This covariance allows for quantifying explanation uncertainties and studying the statistical dependencies between explanations. We further extend our framework to the problem of predictive explanation, and propose a Shapley prior over the explanation function to predict Shapley values for new data based on previously computed ones. Our extensive illustrations demonstrate the effectiveness of the proposed approach.

* 26 pages, 6 figures

Via

Access Paper or Ask Questions

A Rigorous Link between Deep Ensembles and Bayesian Methods

May 24, 2023

Veit David Wild, Sahra Ghalebikesabi, Dino Sejdinovic, Jeremias Knoblauch

Figure 1 for A Rigorous Link between Deep Ensembles and Bayesian Methods

Figure 2 for A Rigorous Link between Deep Ensembles and Bayesian Methods

Figure 3 for A Rigorous Link between Deep Ensembles and Bayesian Methods

Figure 4 for A Rigorous Link between Deep Ensembles and Bayesian Methods

Abstract:We establish the first mathematically rigorous link between Bayesian, variational Bayesian, and ensemble methods. A key step towards this it to reformulate the non-convex optimisation problem typically encountered in deep learning as a convex optimisation in the space of probability measures. On a technical level, our contribution amounts to studying generalised variational inference through the lense of Wasserstein gradient flows. The result is a unified theory of various seemingly disconnected approaches that are commonly used for uncertainty quantification in deep learning -- including deep ensembles and (variational) Bayesian methods. This offers a fresh perspective on the reasons behind the success of deep ensembles over procedures based on parameterised variational inference, and allows the derivation of new ensembling schemes with convergence guarantees. We showcase this by proposing a family of interacting deep ensembles with direct parallels to the interactions of particle systems in thermodynamics, and use our theory to prove the convergence of these algorithms to a well-defined global minimiser on the space of probability measures.

Via

Access Paper or Ask Questions

Squared Neural Families: A New Class of Tractable Density Models

May 22, 2023

Russell Tsuchida, Cheng Soon Ong, Dino Sejdinovic

Figure 1 for Squared Neural Families: A New Class of Tractable Density Models

Figure 2 for Squared Neural Families: A New Class of Tractable Density Models

Figure 3 for Squared Neural Families: A New Class of Tractable Density Models

Figure 4 for Squared Neural Families: A New Class of Tractable Density Models

Abstract:Flexible models for probability distributions are an essential ingredient in many machine learning tasks. We develop and investigate a new class of probability distributions, which we call a Squared Neural Family (SNEFY), formed by squaring the 2-norm of a neural network and normalising it with respect to a base measure. Following the reasoning similar to the well established connections between infinitely wide neural networks and Gaussian processes, we show that SNEFYs admit a closed form normalising constants in many cases of interest, thereby resulting in flexible yet fully tractable density models. SNEFYs strictly generalise classical exponential families, are closed under conditioning, and have tractable marginal distributions. Their utility is illustrated on a variety of density estimation and conditional density estimation tasks. Software available at https://github.com/RussellTsuchida/snefy.

* Preprint

Via

Access Paper or Ask Questions

Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Jan 26, 2023

Shahine Bouabid, Jake Fawkes, Dino Sejdinovic

Figure 1 for Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Figure 2 for Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Figure 3 for Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Figure 4 for Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Abstract:A directed acyclic graph (DAG) provides valuable prior knowledge that is often discarded in regression tasks in machine learning. We show that the independences arising from the presence of collider structures in DAGs provide meaningful inductive biases, which constrain the regression hypothesis space and improve predictive performance. We introduce collider regression, a framework to incorporate probabilistic causal knowledge from a collider in a regression problem. When the hypothesis space is a reproducing kernel Hilbert space, we prove a strictly positive generalisation benefit under mild assumptions and provide closed-form estimators of the empirical risk minimiser. Experiments on synthetic and climate model data demonstrate performance gains of the proposed methodology.

Via

Access Paper or Ask Questions

Doubly Robust Kernel Statistics for Testing Distributional Treatment Effects Even Under One Sided Overlap

Dec 09, 2022

Jake Fawkes, Robert Hu, Robin J. Evans, Dino Sejdinovic

Abstract:As causal inference becomes more widespread the importance of having good tools to test for causal effects increases. In this work we focus on the problem of testing for causal effects that manifest in a difference in distribution for treatment and control. We build on work applying kernel methods to causality, considering the previously introduced Counterfactual Mean Embedding framework (\textsc{CfME}). We improve on this by proposing the \emph{Doubly Robust Counterfactual Mean Embedding} (\textsc{DR-CfME}), which has better theoretical properties than its predecessor by leveraging semiparametric theory. This leads us to propose new kernel based test statistics for distributional effects which are based upon doubly robust estimators of treatment effects. We propose two test statistics, one which is a direct improvement on previous work and one which can be applied even when the support of the treatment arm is a subset of that of the control arm. We demonstrate the validity of our methods on simulated and real-world data, as well as giving an application in off-policy evaluation.

* 9 pages, Preprint

Via

Access Paper or Ask Questions

Bayesian Counterfactual Mean Embeddings and Off-Policy Evaluation

Nov 02, 2022

Diego Martinez-Taboada, Dino Sejdinovic

Abstract:The counterfactual distribution models the effect of the treatment in the untreated group. While most of the work focuses on the expected values of the treatment effect, one may be interested in the whole counterfactual distribution or other quantities associated to it. Building on the framework of Bayesian conditional mean embeddings, we propose a Bayesian approach for modeling the counterfactual distribution, which leads to quantifying the epistemic uncertainty about the distribution. The framework naturally extends to the setting where one observes multiple treatment effects (e.g. an intermediate effect after an interim period, and an ultimate treatment effect which is of main interest) and allows for additionally modelling uncertainty about the relationship of these effects. For such goal, we present three novel Bayesian methods to estimate the expectation of the ultimate treatment effect, when only noisy samples of the dependence between intermediate and ultimate effects are provided. These methods differ on the source of uncertainty considered and allow for combining two sources of data. Moreover, we generalize these ideas to the off-policy evaluation framework, which can be seen as an extension of the counterfactual estimation problem. We empirically explore the calibration of the algorithms in two different experimental settings which require data fusion, and illustrate the value of considering the uncertainty stemming from the two sources of data.

Via

Access Paper or Ask Questions