Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tim G. J. Rudner

A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction

Oct 30, 2024

Qidong Yang, Weicheng Zhu, Joseph Keslin, Laure Zanna, Tim G. J. Rudner, Carlos Fernandez-Granda

Figure 1 for A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction

Figure 2 for A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction

Figure 3 for A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction

Figure 4 for A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction

Abstract:Probabilistic prediction of sequences from images and other high-dimensional data is a key challenge, particularly in risk-sensitive applications. In these settings, it is often desirable to quantify the uncertainty associated with the prediction (instead of just determining the most likely sequence, as in language modeling). In this paper, we propose a Monte Carlo framework to estimate probabilities and confidence intervals associated with the distribution of a discrete sequence. Our framework uses a Monte Carlo simulator, implemented as an autoregressively trained neural network, to sample sequences conditioned on an image input. We then use these samples to estimate the probabilities and confidence intervals. Experiments on synthetic and real data show that the framework produces accurate discriminative predictions, but can suffer from miscalibration. In order to address this shortcoming, we propose a time-dependent regularization method, which is shown to produce calibrated predictions.

Via

Access Paper or Ask Questions

Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds

Oct 16, 2024

Xingzhi Sun, Danqi Liao, Kincaid MacDonald, Yanlei Zhang, Chen Liu, Guillaume Huguet, Guy Wolf, Ian Adelstein, Tim G. J. Rudner, Smita Krishnaswamy

Figure 1 for Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds

Figure 2 for Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds

Figure 3 for Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds

Figure 4 for Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds

Abstract:Rapid growth of high-dimensional datasets in fields such as single-cell RNA sequencing and spatial genomics has led to unprecedented opportunities for scientific discovery, but it also presents unique computational and statistical challenges. Traditional methods struggle with geometry-aware data generation, interpolation along meaningful trajectories, and transporting populations via feasible paths. To address these issues, we introduce Geometry-Aware Generative Autoencoder (GAGA), a novel framework that combines extensible manifold learning with generative modeling. GAGA constructs a neural network embedding space that respects the intrinsic geometries discovered by manifold learning and learns a novel warped Riemannian metric on the data space. This warped metric is derived from both the points on the data manifold and negative samples off the manifold, allowing it to characterize a meaningful geometry across the entire latent space. Using this metric, GAGA can uniformly sample points on the manifold, generate points along geodesics, and interpolate between populations across the learned manifold. GAGA shows competitive performance in simulated and real world datasets, including a 30% improvement over the state-of-the-art methods in single-cell population-level trajectory inference.

Via

Access Paper or Ask Questions

Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

Jul 16, 2024

Leo Klarner, Tim G. J. Rudner, Garrett M. Morris, Charlotte M. Deane, Yee Whye Teh

Figure 1 for Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

Figure 2 for Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

Figure 3 for Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

Figure 4 for Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

Abstract:Generative models have the potential to accelerate key steps in the discovery of novel molecular therapeutics and materials. Diffusion models have recently emerged as a powerful approach, excelling at unconditional sample generation and, with data-driven guidance, conditional generation within their training domain. Reliably sampling from high-value regions beyond the training data, however, remains an open challenge -- with current methods predominantly focusing on modifying the diffusion process itself. In this paper, we develop context-guided diffusion (CGD), a simple plug-and-play method that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models. We demonstrate that this approach leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes with applications across drug discovery, materials science, and protein design.

* Published in the Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

Via

Access Paper or Ask Questions

Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

May 09, 2024

Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner

Figure 1 for Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

Figure 2 for Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

Figure 3 for Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

Figure 4 for Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

Abstract:Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs. Such capabilities are difficult to learn solely from task-specific data. This has led to the emergence of pre-trained vision-language models as a tool for transferring representations learned from internet-scale data to downstream tasks and new domains. However, commonly used contrastively trained representations such as in CLIP have been shown to fail at enabling embodied agents to gain a sufficiently fine-grained scene understanding -- a capability vital for control. To address this shortcoming, we consider representations from pre-trained text-to-image diffusion models, which are explicitly optimized to generate images from text prompts and as such, contain text-conditioned representations that reflect highly fine-grained visuo-spatial information. Using pre-trained text-to-image diffusion models, we construct Stable Control Representations which allow learning downstream control policies that generalize to complex, open-ended environments. We show that policies learned using Stable Control Representations are competitive with state-of-the-art representation learning approaches across a broad range of simulated control settings, encompassing challenging manipulation and navigation tasks. Most notably, we show that Stable Control Representations enable learning policies that exhibit state-of-the-art performance on OVMM, a difficult open-vocabulary navigation benchmark.

Via

Access Paper or Ask Questions

Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

Apr 27, 2024

Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, Julia Kempe

Figure 1 for Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

Figure 2 for Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

Figure 3 for Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

Figure 4 for Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

Abstract:Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations. In this work, we examine this claim. To study the adversarial robustness of BNNs, we investigate whether it is possible to successfully break state-of-the-art BNN inference methods and prediction pipelines using even relatively unsophisticated attacks for three tasks: (1) label prediction under the posterior predictive mean, (2) adversarial example detection with Bayesian predictive uncertainty, and (3) semantic shift detection. We find that BNNs trained with state-of-the-art approximate inference methods, and even BNNs trained with Hamiltonian Monte Carlo, are highly susceptible to adversarial attacks. We also identify various conceptual and experimental errors in previous works that claimed inherent adversarial robustness of BNNs and conclusively demonstrate that BNNs and uncertainty-aware Bayesian prediction pipelines are not inherently robust against adversarial attacks.

Via

Access Paper or Ask Questions

Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors

Mar 14, 2024

Tim G. J. Rudner, Ya Shi Zhang, Andrew Gordon Wilson, Julia Kempe

Abstract:Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this paper, we develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well under subpopulation shifts. We design a simple group-aware prior that only requires access to a small set of data with group information and demonstrate that training with this prior yields state-of-the-art performance -- even when only retraining the final layer of a previously trained non-robust model. Group aware-priors are conceptually simple, complementary to existing approaches, such as attribute pseudo labeling and data reweighting, and open up promising new avenues for harnessing Bayesian inference to enable robustness to subpopulation shifts.

* Published in Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

Via

Access Paper or Ask Questions

Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI

Feb 06, 2024

Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, Jose Miguel Hernandez Lobato(+15 more)

Figure 1 for Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI

Figure 2 for Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI

Abstract:In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learning (BDL) constitutes a promising avenue, offering advantages across these diverse settings. This paper posits that BDL can elevate the capabilities of deep learning. It revisits the strengths of BDL, acknowledges existing challenges, and highlights some exciting research avenues aimed at addressing these obstacles. Looking ahead, the discussion focuses on possible ways to combine large-scale foundation models with BDL to unlock their full potential.

Via

Access Paper or Ask Questions

Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution

Dec 28, 2023

Ying Wang, Tim G. J. Rudner, Andrew Gordon Wilson

Abstract:Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability. To improve the interpretability of vision-language models such as CLIP, we propose a multi-modal information bottleneck (M2IB) approach that learns latent representations that compress irrelevant information while preserving relevant visual and textual features. We demonstrate how M2IB can be applied to attribution analysis of vision-language pretrained models, increasing attribution accuracy and improving the interpretability of such models when applied to safety-critical domains such as healthcare. Crucially, unlike commonly used unimodal attribution methods, M2IB does not require ground truth labels, making it possible to audit representations of vision-language pretrained models when multiple modalities but no ground-truth data is available. Using CLIP as an example, we demonstrate the effectiveness of M2IB attribution and show that it outperforms gradient-based, perturbation-based, and attention-based attribution methods both qualitatively and quantitatively.

* Published in Advances in Neural Information Processing Systems 36 (NeurIPS 2023)

Via

Access Paper or Ask Questions

Continual Learning via Sequential Function-Space Variational Inference

Dec 28, 2023

Tim G. J. Rudner, Freddie Bickford Smith, Qixuan Feng, Yee Whye Teh, Yarin Gal

Abstract:Sequential Bayesian inference over predictive functions is a natural framework for continual learning from streams of data. However, applying it to neural networks has proved challenging in practice. Addressing the drawbacks of existing techniques, we propose an optimization objective derived by formulating continual learning as sequential function-space variational inference. In contrast to existing methods that regularize neural network parameters directly, this objective allows parameters to vary widely during training, enabling better adaptation to new tasks. Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions and more effective regularization. We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods while depending less on maintaining a set of representative points from previous tasks.

* Published in Proceedings of the 39th International Conference on Machine Learning (ICML 2022)

Via

Access Paper or Ask Questions

Tractable Function-Space Variational Inference in Bayesian Neural Networks

Dec 28, 2023

Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, Yarin Gal

Figure 1 for Tractable Function-Space Variational Inference in Bayesian Neural Networks

Figure 2 for Tractable Function-Space Variational Inference in Bayesian Neural Networks

Figure 3 for Tractable Function-Space Variational Inference in Bayesian Neural Networks

Figure 4 for Tractable Function-Space Variational Inference in Bayesian Neural Networks

Abstract:Reliable predictive uncertainty estimation plays an important role in enabling the deployment of neural networks to safety-critical settings. A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters, infer an approximate posterior distribution, and use it to make stochastic predictions. However, explicit inference over neural network parameters makes it difficult to incorporate meaningful prior information about the data-generating process into the model. In this paper, we pursue an alternative approach. Recognizing that the primary object of interest in most settings is the distribution over functions induced by the posterior distribution over neural network parameters, we frame Bayesian inference in neural networks explicitly as inferring a posterior distribution over functions and propose a scalable function-space variational inference method that allows incorporating prior information and results in reliable predictive uncertainty estimates. We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks and demonstrate that it performs well on a challenging safety-critical medical diagnosis task in which reliable uncertainty estimation is essential.

* Published in Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

Via

Access Paper or Ask Questions