Javier Antorán

Adapting the Linearised Laplace Model Evidence for Modern Deep Learning

Jun 17, 2022

Javier Antorán, David Janz, James Urquhart Allingham, Erik Daxberger, Riccardo Barbano, Eric Nalisnick, José Miguel Hernández-Lobato

Abstract: The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.

* Paper appearing at ICML 2022

A Probabilistic Deep Image Prior for Computational Tomography

Feb 28, 2022

Javier Antorán, Riccardo Barbano, Johannes Leuschner, José Miguel Hernández-Lobato, Bangti Jin

Abstract: Existing deep-learning based tomographic image reconstruction methods do not provide accurate estimates of reconstruction uncertainty, hindering their real-world deployment. To address this limitation, we construct a Bayesian prior for tomographic reconstruction, which combines the classical total variation (TV) regulariser with the modern deep image prior (DIP). Specifically, we use a change of variables to connect our prior beliefs on the image TV semi-norm with the hyper-parameters of the DIP network. For the inference, we develop an approach based on the linearised Laplace method, which is scalable to high-dimensional settings. The resulting framework provides pixel-wise uncertainty estimates and a marginal likelihood objective for hyperparameter optimisation. We demonstrate the method on synthetic and real-measured high-resolution $\mu$CT data, and show that it provides superior calibration of uncertainty estimates relative to previous probabilistic formulations of the DIP.

Addressing Bias in Active Learning with Depth Uncertainty Networks or Not

Dec 13, 2021

Chelsea Murray, James U. Allingham, Javier Antorán, José Miguel Hernández-Lobato

Abstract: Farquhar et al. [2021] show that correcting for active learning bias with underparameterised models leads to improved downstream performance. For overparameterised models such as NNs, however, correction leads either to decreased or unchanged performance. They suggest that this is due to an "overfitting bias" which offsets the active learning bias. We show that depth uncertainty networks operate in a low overfitting regime, much like underparameterised models. They should therefore see an increase in performance with bias correction. Surprisingly, they do not. We propose that this negative result, as well as the results of Farquhar et al. [2021], can be explained via the lens of the bias-variance decomposition of generalisation error.

* arXiv admin note: substantial text overlap with arXiv:2112.06796

Depth Uncertainty Networks for Active Learning

Dec 13, 2021

Chelsea Murray, James U. Allingham, Javier Antorán, José Miguel Hernández-Lobato

Abstract: In active learning, the size and complexity of the training dataset change over time. Simple models that are well specified by the amount of data available at the start of active learning might suffer from bias as more points are actively sampled. Flexible models that might be well suited to the full dataset can suffer from overfitting towards the start of active learning. We tackle this problem using Depth Uncertainty Networks (DUNs), a BNN variant in which the depth of the network, and thus its complexity, is inferred. We find that DUNs outperform other BNN variants on several active learning tasks. Importantly, we show that on the tasks in which DUNs perform best they present notably less overfitting than baselines.

Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty

Nov 15, 2020

Umang Bhatt, Yunfeng Zhang, Javier Antorán, Q. Vera Liao, Prasanna Sattigeri, Riccardo Fogliato, Gabrielle Gauthier Melançon, Ranganath Krishnan, Jason Stanley, Omesh Tickoo (+4 more)

Abstract: Transparency of algorithmic systems entails exposing system properties to various stakeholders for purposes that include understanding, improving, and/or contesting predictions. The machine learning (ML) community has mostly considered explainability as a proxy for transparency. With this work, we seek to encourage researchers to study uncertainty as a form of transparency and practitioners to communicate uncertainty estimates to stakeholders. First, we discuss methods for assessing uncertainty. Then, we describe the utility of uncertainty for mitigating model unfairness, augmenting decision-making, and building trustworthy systems. We also review methods for displaying uncertainty to stakeholders and discuss how to collect information required for incorporating uncertainty into existing ML pipelines. Our contribution is an interdisciplinary review to inform how to measure, communicate, and use uncertainty as a form of transparency.

* 19 pages, 6 figures

Expressive yet Tractable Bayesian Deep Learning via Subnetwork Inference

Oct 28, 2020

Erik Daxberger, Eric Nalisnick, James Urquhart Allingham, Javier Antorán, José Miguel Hernández-Lobato

Abstract: The Bayesian paradigm has the potential to solve some of the core issues in modern deep learning, such as poor calibration, data inefficiency, and catastrophic forgetting. However, scaling Bayesian inference to the high-dimensional parameter spaces of deep neural networks requires restrictive approximations. In this paper, we propose performing inference over only a small subset of the model parameters while keeping all others as point estimates. This enables us to use expressive posterior approximations that would otherwise be intractable for the full model. In particular, we develop a practical and scalable Bayesian deep learning method that first trains a point estimate, and then infers a full covariance Gaussian posterior approximation over a subnetwork. We propose a subnetwork selection procedure which aims to optimally preserve posterior uncertainty. We empirically demonstrate the effectiveness of our approach compared to point-estimated networks and methods that use less expressive posterior approximations over the full network.

* 15 pages, extended version with supplementary material

Depth Uncertainty in Neural Networks

Jun 15, 2020

Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato

Abstract: Existing methods for estimating uncertainty in deep learning tend to require multiple forward passes, making them unsuitable for applications where computational resources are limited. To solve this, we perform probabilistic reasoning over the depth of neural networks. Different depths correspond to subnetworks which share weights and whose predictions are combined via marginalisation, yielding model uncertainty. By exploiting the sequential structure of feed-forward networks, we are able to both evaluate our training objective and make predictions with a single forward pass. We validate our approach on real-world regression and image classification tasks. Our approach provides uncertainty calibration, robustness to dataset shift, and accuracies competitive with more computationally expensive baselines.
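
The idea of combining per-depth predictions in one pass can be illustrated with a minimal sketch. This is not the authors' implementation: the residual ReLU blocks, the single shared output head, the treatment of depth 0 as "no blocks applied", and every name below (`depth_marginal_predict`, `depth_logits`, `affine`) are assumptions made for illustration. The activations after each block are reused, so the shared blocks are traversed only once while the class probabilities from every depth are averaged under a categorical distribution over depths.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def affine(h, W, b):
    """Return h @ W + b for a vector h and a matrix W stored as a list of rows."""
    return [sum(h[i] * W[i][j] for i in range(len(h))) + b[j] for j in range(len(b))]


def depth_marginal_predict(x, blocks, head, depth_logits):
    """Average per-depth class probabilities in one pass through the blocks.

    Hypothetical setup: depth d means "apply the first d blocks, then the
    shared head"; depth_logits parameterise a categorical over depths 0..D.
    """
    q_depth = softmax(depth_logits)
    W_out, b_out = head
    h = list(x)
    # Depth 0: the head is applied directly to the input features.
    probs = [q_depth[0] * p for p in softmax(affine(h, W_out, b_out))]
    for d, (W, b) in enumerate(blocks, start=1):
        # Residual block with ReLU; h is reused by every deeper subnetwork.
        update = [max(v, 0.0) for v in affine(h, W, b)]
        h = [hi + ui for hi, ui in zip(h, update)]
        p_d = softmax(affine(h, W_out, b_out))
        probs = [acc + q_depth[d] * p for acc, p in zip(probs, p_d)]
    return probs


def rand_matrix(rows, cols, scale):
    """Random Gaussian matrix, used here only as stand-in weights."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
width, n_classes, n_blocks = 4, 3, 3
blocks = [(rand_matrix(width, width, 0.3), [0.0] * width) for _ in range(n_blocks)]
head = (rand_matrix(width, n_classes, 1.0), [0.0] * n_classes)
depth_logits = [0.0] * (n_blocks + 1)  # uniform weighting over depths 0..3
x = [random.gauss(0.0, 1.0) for _ in range(width)]

probs = depth_marginal_predict(x, blocks, head, depth_logits)
print(probs)
```

Because each per-depth prediction is a valid distribution and the depth weights sum to one, the mixture is again a valid distribution. Putting almost all weight on one depth recovers that subnetwork's prediction.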

Getting a CLUE: A Method for Explaining Uncertainty Estimates

Jun 11, 2020

Javier Antorán, Umang Bhatt, Tameem Adel, Adrian Weller, José Miguel Hernández-Lobato

Abstract: Both uncertainty estimation and interpretability are important factors for trustworthy machine learning systems. However, there is little work at the intersection of these two areas. We address this gap by proposing a novel method for interpreting uncertainty estimates from differentiable probabilistic models, like Bayesian Neural Networks (BNNs). Our method, Counterfactual Latent Uncertainty Explanations (CLUE), indicates how to change an input, while keeping it on the data manifold, such that a BNN becomes more confident about the input's prediction. We validate CLUE through 1) a novel framework for evaluating counterfactual explanations of uncertainty, 2) a series of ablation experiments, and 3) a user study. Our experiments show that CLUE outperforms baselines and enables practitioners to better understand which input patterns are responsible for predictive uncertainty.

* 33 pages, 31 figures

Variational Depth Search in ResNets

Feb 27, 2020

Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato

Abstract: One-shot neural architecture search allows joint learning of weights and network architecture, reducing computational cost. We limit our search space to the depth of residual networks and formulate an analytically tractable variational objective that allows for obtaining an unbiased approximate posterior over depths in one-shot. We propose a heuristic to prune our networks based on this distribution. We compare our proposed method against manual search over network depths on the MNIST, Fashion-MNIST, and SVHN datasets. We find that pruned networks do not incur a loss in predictive performance, obtaining accuracies competitive with unpruned networks. Marginalising over depth allows us to obtain better-calibrated test-time uncertainty estimates than regular networks, in a single forward pass.

* Appearing at the 1st ICLR workshop on Neural Architecture Search 2020
