Thang D. Bui

Sparse Gaussian Processes: Structured Approximations and Power-EP Revisited

Jul 03, 2025

Thang D. Bui, Michalis K. Titsias

Abstract:Inducing-point-based sparse variational Gaussian processes have become the standard workhorse for scaling up GP models. Recent advances show that these methods can be improved by introducing a diagonal scaling matrix to the conditional posterior density given the inducing points. This paper first considers an extension that employs a block-diagonal structure for the scaling matrix, provably tightening the variational lower bound. We then revisit the unifying framework of sparse GPs based on Power Expectation Propagation (PEP) and show that it can leverage and benefit from the new structured approximate posteriors. Through extensive regression experiments, we show that the proposed block-diagonal approximation consistently performs similarly to or better than existing diagonal approximations while maintaining comparable computational costs. Furthermore, the new PEP framework with structured posteriors provides competitive performance across various power hyperparameter settings, offering practitioners flexible alternatives to standard variational approaches.

Tighter sparse variational Gaussian processes

Feb 07, 2025

Thang D. Bui, Matthew Ashman, Richard E. Turner

Abstract:Sparse variational Gaussian process (GP) approximations based on inducing points have become the de facto standard for scaling GPs to large datasets, owing to their theoretical elegance, computational efficiency, and ease of implementation. This paper introduces a provably tighter variational approximation by relaxing the standard assumption that the conditional approximate posterior given the inducing points must match that in the prior. The key innovation is to modify the conditional posterior to have smaller variances than those of the prior at the training points. We derive the collapsed bound for the regression case, describe how to use the proposed approximation in large data settings, and discuss its application to handle orthogonally structured inducing points and GP latent variable models. Extensive experiments on regression benchmarks, classification, and latent variable models demonstrate that the proposed approximation consistently matches or outperforms standard sparse variational GPs while maintaining the same computational cost. An implementation will be made available in all popular GP packages.
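
For context, the standard collapsed bound for sparse variational GP regression, which the abstract above takes as its baseline, can be written as follows. This is the well-known baseline form, not the tighter bound the paper derives. The notation is an assumption made here for illustration: a zero-mean GP, n training targets y, m inducing points, kernel matrices K_nn, K_nm, K_mm, and Gaussian noise variance sigma^2.

```latex
\mathcal{F}(\mathbf{y})
  = \log \mathcal{N}\!\left(\mathbf{y} \,\middle|\, \mathbf{0},\; \mathbf{Q}_{nn} + \sigma^2 \mathbf{I}\right)
  - \frac{1}{2\sigma^2}\, \mathrm{tr}\!\left(\mathbf{K}_{nn} - \mathbf{Q}_{nn}\right),
\qquad
\mathbf{Q}_{nn} = \mathbf{K}_{nm} \mathbf{K}_{mm}^{-1} \mathbf{K}_{mn}.
```

The trace term penalises the prior conditional variance left unexplained by the inducing points at the training inputs, which is the quantity the abstract proposes to shrink.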

Improving Uncertainty Quantification in Large Language Models via Semantic Embeddings

Oct 30, 2024

Yashvir S. Grewal, Edwin V. Bonilla, Thang D. Bui

Abstract:Accurately quantifying uncertainty in large language models (LLMs) is crucial for their reliable deployment, especially in high-stakes applications. Current state-of-the-art methods for measuring semantic uncertainty in LLMs rely on strict bidirectional entailment criteria between multiple generated responses and also depend on sequence likelihoods. While effective, these approaches often overestimate uncertainty due to their sensitivity to minor wording differences, additional correct information, and non-important words in the sequence. We propose a novel approach that leverages semantic embeddings to achieve smoother and more robust estimation of semantic uncertainty in LLMs. By capturing semantic similarities without depending on sequence likelihoods, our method inherently reduces any biases introduced by irrelevant words in the answers. Furthermore, we introduce an amortised version of our approach by explicitly modelling semantics as latent variables in a joint probabilistic model. This allows for uncertainty estimation in the embedding space with a single forward pass, significantly reducing computational overhead compared to existing multi-pass methods. Experiments across multiple question-answering datasets and frontier LLMs demonstrate that our embedding-based methods provide more accurate and nuanced uncertainty quantification than traditional approaches.
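
As a rough illustration of the embedding-based idea in this abstract, the sketch below scores uncertainty as one minus the average pairwise cosine similarity between embeddings of several sampled answers. The function name `embedding_uncertainty` and the toy vectors are assumptions for illustration; real embeddings would come from a sentence encoder, and the paper's exact estimator may differ.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embedding_uncertainty(embeddings):
    """One minus the mean pairwise cosine similarity (hypothetical score).

    Values near 0 mean the sampled answers agree semantically;
    larger values mean they are spread out in embedding space.
    """
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    mean_sim = sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)
    return 1.0 - mean_sim


# Toy embeddings standing in for encoder outputs (assumed data).
consistent = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
scattered = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

print(embedding_uncertainty(consistent))
print(embedding_uncertainty(scattered))
```

Because the score depends only on directions in embedding space, answers that differ in wording but point the same way contribute no uncertainty, which is the behaviour the abstract contrasts with strict entailment checks.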

Likelihood approximations via Gaussian approximate inference

Oct 28, 2024

Thang D. Bui

Abstract:Non-Gaussian likelihoods are essential for modelling complex real-world observations but pose significant computational challenges in learning and inference. Even with Gaussian priors, non-Gaussian likelihoods often lead to analytically intractable posteriors, necessitating approximation methods. To this end, we propose efficient schemes to approximate the effects of non-Gaussian likelihoods by Gaussian densities based on variational inference and moment matching in transformed bases. These enable efficient inference strategies originally designed for models with a Gaussian likelihood to be deployed. Our empirical results demonstrate that the proposed matching strategies attain good approximation quality for binary and multiclass classification in large-scale point-estimate and distributional inferential settings. In challenging streaming problems, the proposed methods outperform all existing likelihood approximations and approximate inference methods in the exact models. As a by-product, we show that the proposed approximate log-likelihoods are a superior alternative to least-squares on raw labels for neural network classification.

Partitioned Variational Inference: A Framework for Probabilistic Federated Learning

Feb 28, 2022

Matthew Ashman, Thang D. Bui, Cuong V. Nguyen, Stratis Markou, Adrian Weller, Siddharth Swaroop, Richard E. Turner

Abstract:The proliferation of computing devices has brought about an opportunity to deploy machine learning models on new problem domains using previously inaccessible data. Traditional algorithms for training such models often require data to be stored on a single machine with compute performed by a single node, making them unsuitable for decentralised training on multiple devices. This deficiency has motivated the development of federated learning algorithms, which allow multiple data owners to train collaboratively and use a shared model whilst keeping local data private. However, many of these algorithms focus on obtaining point estimates of model parameters, rather than probabilistic estimates capable of capturing model uncertainty, which is essential in many applications. Variational inference (VI) has become the method of choice for fitting many modern probabilistic models. In this paper we introduce partitioned variational inference (PVI), a general framework for performing VI in the federated setting. We develop new supporting theory for PVI, demonstrating a number of properties that make it an attractive choice for practitioners; use PVI to unify a wealth of fragmented, yet related literature; and provide empirical results that showcase the effectiveness of PVI in a variety of federated settings.

* arXiv admin note: substantial text overlap with arXiv:1811.11206

Variational Auto-Regressive Gaussian Processes for Continual Learning

Jun 09, 2020

Sanyam Kapoor, Theofanis Karaletsos, Thang D. Bui

Abstract:This paper proposes Variational Auto-Regressive Gaussian Process (VAR-GP), a principled Bayesian updating mechanism to incorporate new data for sequential tasks in the context of continual learning. It relies on a novel auto-regressive characterization of the variational distribution, and inference is made scalable using sparse inducing point approximations. Experiments on standard continual learning benchmarks demonstrate the ability of VAR-GPs to perform well at new tasks without compromising performance on old ones, yielding results competitive with state-of-the-art methods. In addition, we qualitatively show how VAR-GP improves the predictive entropy estimates as we train on new tasks. Further, we conduct a thorough ablation study to verify the effectiveness of inferential choices.

* Preprint. Under review

Hierarchical Gaussian Process Priors for Bayesian Neural Network Weights

Feb 10, 2020

Theofanis Karaletsos, Thang D. Bui

Abstract:Probabilistic neural networks are typically modeled with independent weight priors, which do not capture weight correlations in the prior and do not provide a parsimonious interface to express properties in function space. A desirable class of priors would represent weights compactly, capture correlations between weights, facilitate calibrated reasoning about uncertainty, and allow inclusion of prior knowledge about the function space such as periodicity or dependence on contexts such as inputs. To this end, this paper introduces two innovations: (i) a Gaussian process-based hierarchical model for network weights based on unit embeddings that can flexibly encode correlated weight structures, and (ii) input-dependent versions of these weight priors that can provide convenient ways to regularize the function space through the use of kernels defined on contextual inputs. We show these models provide desirable test-time uncertainty estimates on out-of-distribution data, demonstrate cases of modeling inductive biases for neural networks with kernels which help both interpolation and extrapolation from training data, and demonstrate competitive predictive performance on an active learning benchmark.

* 12 pages main paper, 13 pages appendix

Improving and Understanding Variational Continual Learning

May 06, 2019

Siddharth Swaroop, Cuong V. Nguyen, Thang D. Bui, Richard E. Turner

Abstract:In the continual learning setting, tasks are encountered sequentially. The goal is to learn whilst i) avoiding catastrophic forgetting, ii) efficiently using model capacity, and iii) employing forward and backward transfer learning. In this paper, we explore how the Variational Continual Learning (VCL) framework achieves these desiderata on two benchmarks in continual learning: split MNIST and permuted MNIST. We first report significantly improved results on what was already a competitive approach. The improvements are achieved by establishing a new best practice approach to mean-field variational Bayesian neural networks. We then look at the solutions in detail. This allows us to obtain an understanding of why VCL performs as it does, and we compare the solution to what an 'ideal' continual learning solution might be.

Partitioned Variational Inference: A unified framework encompassing federated and continual learning

Nov 27, 2018

Thang D. Bui, Cuong V. Nguyen, Siddharth Swaroop, Richard E. Turner

Abstract:Variational inference (VI) has become the method of choice for fitting many modern probabilistic models. However, practitioners are faced with a fragmented literature that offers a bewildering array of algorithmic options. These options span three dimensions: first, the variational family; second, the granularity of the updates, e.g. whether the updates are local to each data point and employ message passing, or global; third, the method of optimization (bespoke or blackbox, closed-form or stochastic updates, etc.). This paper presents a new framework, termed Partitioned Variational Inference (PVI), that explicitly acknowledges these algorithmic dimensions of VI, unifies disparate literature, and provides guidance on usage. Crucially, the proposed PVI framework allows us to identify new ways of performing VI that are ideally suited to challenging learning scenarios including federated learning (where distributed computing is leveraged to process non-centralized data) and continual learning (where new data and tasks arrive over time and must be accommodated quickly). We showcase these new capabilities by developing communication-efficient federated training of Bayesian neural networks and continual learning for Gaussian process models with private pseudo-points. The new methods significantly outperform the state-of-the-art, whilst being almost as straightforward to implement as standard VI.

Variational Continual Learning

May 20, 2018

Cuong V. Nguyen, Yingzhen Li, Thang D. Bui, Richard E. Turner

Abstract:This paper develops variational continual learning (VCL), a simple but general framework for continual learning that fuses online variational inference (VI) and recent advances in Monte Carlo VI for neural networks. The framework can successfully train both deep discriminative models and deep generative models in complex continual learning settings where existing tasks evolve over time and entirely new tasks emerge. Experimental results show that VCL outperforms state-of-the-art continual learning methods on a variety of tasks, avoiding catastrophic forgetting in a fully automatic way.

* Published at International Conference on Learning Representations (ICLR) 2018
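
The online recursion behind VCL, in which the approximate posterior after one task becomes the prior for the next, can be illustrated on a toy conjugate model where the Gaussian variational update is exact. The sketch assumes a scalar Gaussian mean with known noise variance; the function name `update_gaussian` and the data are hypothetical, and the paper itself targets neural networks with Monte Carlo VI rather than this closed form.

```python
def update_gaussian(prior_mean, prior_var, observations, noise_var):
    """Posterior over a scalar mean theta, given y_i ~ N(theta, noise_var).

    In this conjugate model the optimal Gaussian variational posterior
    equals the exact posterior, so the update is available in closed form.
    """
    n = len(observations)
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / noise_var)
    return post_mean, post_var


# Two "tasks" arriving one after the other (assumed toy data).
task_1 = [0.9, 1.1, 1.0]
task_2 = [1.4, 1.6]

# Continual update: the posterior after task 1 is the prior for task 2.
mean, var = update_gaussian(0.0, 1.0, task_1, noise_var=0.5)
mean, var = update_gaussian(mean, var, task_2, noise_var=0.5)

# Reference: a single update on all data at once.
batch_mean, batch_var = update_gaussian(0.0, 1.0, task_1 + task_2, noise_var=0.5)

print(mean, var)
print(batch_mean, batch_var)
```

In this exact setting the sequential and batch answers coincide, so nothing is forgotten; with approximate posteriors, as in neural networks, the recursion only approximates the batch solution.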
