Lee Devlin

Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals

Jul 10, 2025

Joshua Murphy, Conor Rosato, Andrew Millard, Lee Devlin, Paul Horridge, Simon Maskell


Abstract: When performing Bayesian inference using Sequential Monte Carlo (SMC) methods, two considerations arise: the accuracy of the posterior approximation and computational efficiency. To address computational demands, Sequential Monte Carlo Squared (SMC$^2$) is well-suited for high-performance computing (HPC) environments. The design of the proposal distribution within SMC$^2$ can improve accuracy and exploration of the posterior as poor proposals may lead to high variance in importance weights and particle degeneracy. The Metropolis-Adjusted Langevin Algorithm (MALA) uses gradient information so that particles preferentially explore regions of higher probability. In this paper, we extend this idea by incorporating second-order information, specifically the Hessian of the log-target. While second-order proposals have been explored previously in particle Markov Chain Monte Carlo (p-MCMC) methods, we are the first to introduce them within the SMC$^2$ framework. Second-order proposals not only use the gradient (first-order derivative), but also the curvature (second-order derivative) of the target distribution. Experimental results on synthetic models highlight the benefits of our approach in terms of step-size selection and posterior approximation accuracy when compared to other proposals.

* Accepted to IEEE Machine Learning Signal Processing conference 2025
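
To make the contrast between first-order and second-order proposals in the abstract above concrete, here is a minimal sketch on a toy one-dimensional Gaussian target. It is not the paper's SMC$^2$ algorithm: the target, the function names, and the choice of the negative Hessian as a preconditioner are all assumptions made for illustration, and the derivatives are analytic here rather than estimated.

```python
import math
import random


def log_target_derivs(x, mu=1.0, var=4.0):
    """Gradient and Hessian of log N(x; mu, var), a toy target (assumption)."""
    grad = -(x - mu) / var
    hess = -1.0 / var
    return grad, hess


def first_order_proposal(x, eps):
    """MALA-style proposal: drift along the gradient, isotropic noise."""
    grad, _ = log_target_derivs(x)
    mean = x + 0.5 * eps ** 2 * grad
    std = eps
    return mean, std


def second_order_proposal(x, eps):
    """Hessian-preconditioned proposal (illustrative form, not the paper's).

    Drift and noise are both scaled by the inverse of the negative Hessian,
    so this requires the Hessian to be negative at x.
    """
    grad, hess = log_target_derivs(x)
    inv_metric = 1.0 / (-hess)
    mean = x + 0.5 * eps ** 2 * inv_metric * grad
    std = eps * math.sqrt(inv_metric)
    return mean, std


def draw(proposal, x, eps, rng):
    """Sample one candidate from the Gaussian proposal centred at `mean`."""
    mean, std = proposal(x, eps)
    return rng.gauss(mean, std)


if __name__ == "__main__":
    rng = random.Random(0)
    x = 5.0
    print("first-order :", first_order_proposal(x, 1.0))
    print("second-order:", second_order_proposal(x, 1.0))
    print("sample      :", draw(second_order_proposal, x, 1.0, rng))
```

In this toy setting the curvature rescales the step to the width of the target, which is one way to read the abstract's remark about step-size selection: the same `eps` moves the second-order proposal further when the target is wide and less when it is narrow.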


Efficient MCMC Sampling with Expensive-to-Compute and Irregular Likelihoods

May 15, 2025

Conor Rosato, Harvinder Lehal, Simon Maskell, Lee Devlin, Malcolm Strens

Abstract: Bayesian inference with Markov Chain Monte Carlo (MCMC) is challenging when the likelihood function is irregular and expensive to compute. We explore several sampling algorithms that make use of subset evaluations to reduce computational overhead. We adapt the subset samplers for this setting where gradient information is not available or is unreliable. To achieve this, we introduce data-driven proxies in place of Taylor expansions and define a novel computation-cost aware adaptive controller. We undertake an extensive evaluation for a challenging disease modelling task and a configurable task with similar irregularity in the likelihood surface. We find our improved version of Hierarchical Importance with Nested Training Samples (HINTS), with adaptive proposals and a data-driven proxy, obtains the best sampling error within a fixed computational budget. We conclude that subset evaluations can provide cheap and naturally tempered exploration, while a data-driven proxy can pre-screen proposals successfully in explored regions of the state space. These two elements combine through hierarchical delayed acceptance to achieve efficient, exact sampling.

* 45 pages
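
The idea of pre-screening proposals with a cheap proxy while still sampling exactly, as described in the abstract above, can be illustrated with a generic two-stage delayed-acceptance Metropolis-Hastings step. This is a sketch of the standard two-stage scheme with a symmetric random-walk proposal, not HINTS or the paper's hierarchical sampler; the function names and the toy target and proxy in the usage note are assumptions.

```python
import math
import random


def delayed_acceptance_mh(log_target, log_proxy, x0, n_steps, step, rng):
    """Two-stage delayed-acceptance random-walk Metropolis-Hastings.

    Stage 1 screens each proposal with the cheap proxy only.
    Stage 2 evaluates the expensive target and corrects for the proxy,
    so the chain still targets exp(log_target) exactly.
    Returns the samples and the number of expensive target evaluations.
    """
    x = x0
    lp_x, lq_x = log_target(x), log_proxy(x)
    full_evals = 1  # the initial state costs one expensive evaluation
    samples = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # symmetric proposal
        lq_y = log_proxy(y)
        # Stage 1: accept or reject using the proxy ratio alone.
        # 1.0 - rng.random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < lq_y - lq_x:
            lp_y = log_target(y)
            full_evals += 1
            # Stage 2: target ratio divided by the proxy ratio used in stage 1.
            if math.log(1.0 - rng.random()) < (lp_y - lp_x) - (lq_y - lq_x):
                x, lp_x, lq_x = y, lp_y, lq_y
        samples.append(x)
    return samples, full_evals


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-ins: a standard normal target and a deliberately wider proxy.
    samples, full_evals = delayed_acceptance_mh(
        log_target=lambda x: -0.5 * x * x,
        log_proxy=lambda x: -0.5 * x * x / 2.25,
        x0=0.0, n_steps=20000, step=1.0, rng=rng,
    )
    print("expensive evaluations:", full_evals, "of", len(samples), "steps")
```

Proposals rejected at stage 1 never trigger an expensive evaluation, which is where the saving comes from; the stage 2 ratio removes the bias a proxy alone would introduce.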


An Entropic Metric for Measuring Calibration of Machine Learning Models

Feb 20, 2025

Daniel James Sumler, Lee Devlin, Simon Maskell, Richard O. Lane

Abstract: Understanding the confidence with which a machine learning model classifies an input datum is an important, and perhaps under-investigated, concept. In this paper, we propose a new calibration metric, the Entropic Calibration Difference (ECD). Based on existing research in the field of state estimation, specifically target tracking (TT), we show how ECD may be applied to binary classification machine learning models. We describe the relative importance of under- and over-confidence and how they are not conflated in the TT literature. Indeed, our metric distinguishes under- from over-confidence. We consider this important given that algorithms that are under-confident are likely to be 'safer' than algorithms that are over-confident, albeit at the expense of also being over-cautious and so statistically inefficient. We demonstrate how this new metric performs on real and simulated data and compare with other metrics for machine learning model probability calibration, including the Expected Calibration Error (ECE) and its signed counterpart, the Expected Signed Calibration Error (ESCE).
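
As a point of reference for the baseline metrics named in the abstract above, here is a minimal sketch of a binned Expected Calibration Error for a binary classifier, together with a signed variant that drops the absolute value. The abstract does not define ECD, so it is not implemented here. The equal-width binning, the sign convention (positive meaning over-confident), and the function name are assumptions and may differ from the definitions used in the paper.

```python
def calibration_errors(probs, labels, n_bins=10):
    """Return (ECE, signed error) for binary predictions.

    probs  : predicted probabilities of the positive class
    labels : true labels, 0 or 1
    Confidence is the probability assigned to the predicted class.
    Signed error > 0 is read as over-confidence (assumed convention).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p
        idx = min(int(conf * n_bins), n_bins - 1)  # equal-width bins on [0, 1]
        bins[idx].append((conf, 1.0 if pred == y else 0.0))

    n = len(probs)
    ece = 0.0
    signed = 0.0
    for b in bins:
        if not b:
            continue
        mean_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(a for _, a in b) / len(b)
        gap = mean_conf - accuracy
        weight = len(b) / n
        ece += weight * abs(gap)   # unsigned: size of miscalibration only
        signed += weight * gap     # signed: keeps the direction
    return ece, signed


if __name__ == "__main__":
    # Over-confident toy model: says 0.9 but is right 6 times out of 10.
    print(calibration_errors([0.9] * 10, [1] * 6 + [0] * 4))
    # Under-confident toy model: says 0.6 but is right 9 times out of 10.
    print(calibration_errors([0.6] * 10, [1] * 9 + [0] * 1))
```

The two toy models have the same unsigned error but opposite signed errors, which is the distinction between under- and over-confidence that the abstract emphasises.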
