Samuel Kaski

Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University; Department of Computer Science, University of Manchester

Parallel MCMC Without Embarrassing Failures

Mar 29, 2022
Daniel Augusto de Souza, Diego Mesquita, Samuel Kaski, Luigi Acerbi

Embarrassingly parallel Markov Chain Monte Carlo (MCMC) exploits parallel computing to scale Bayesian inference to large datasets by using a two-step approach. First, MCMC is run in parallel on (sub)posteriors defined on data partitions. Then, a server combines local results. While efficient, this framework is very sensitive to the quality of subposterior sampling. Common sampling problems such as missing modes or misrepresentation of low-density regions are amplified, instead of being corrected, in the combination phase, leading to catastrophic failures. In this work, we propose a novel combination strategy to mitigate this issue. Our strategy, Parallel Active Inference (PAI), leverages Gaussian Process (GP) surrogate modeling and active learning. After fitting GPs to subposteriors, PAI (i) shares information between GP surrogates to cover missing modes; and (ii) uses active sampling to individually refine subposterior approximations. We validate PAI on challenging benchmarks, including heavy-tailed and multi-modal posteriors and a real-world application to computational neuroscience. Empirical results show that PAI succeeds where previous methods catastrophically fail, with a small communication overhead.

* To appear in the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022). For associated code, see https://github.com/spectraldani/pai/
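
The two-step scheme described in the abstract (parallel subposterior sampling, then a server-side combination) can be illustrated with a minimal sketch. This is not PAI and not the authors' code: it uses a plain random-walk Metropolis sampler and the simple parametric (Gaussian product) combination rule as a stand-in for the combination step. The toy model (Gaussian mean with known noise variance), all function names, and all settings are assumptions made for illustration.

```python
import numpy as np

def log_subposterior(theta, y, n_parts, prior_var=10.0, noise_var=1.0):
    # Fractionated prior p(theta)^(1/K) times the likelihood of one data partition.
    log_prior = -0.5 * theta**2 / prior_var / n_parts
    log_lik = -0.5 * np.sum((y - theta) ** 2) / noise_var
    return log_prior + log_lik

def metropolis(log_p, n_samples, step, rng, init=0.0):
    # Random-walk Metropolis for a scalar parameter.
    samples = np.empty(n_samples)
    theta, lp = init, log_p(init)
    for i in range(n_samples):
        prop = theta + step * rng.normal()
        lp_prop = log_p(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples[i] = theta
    return samples

def gaussian_combine(sub_samples):
    # Approximate each subposterior by a Gaussian and multiply the Gaussians:
    # precisions add, and the mean is the precision-weighted average.
    precisions = np.array([1.0 / np.var(s) for s in sub_samples])
    means = np.array([np.mean(s) for s in sub_samples])
    var = 1.0 / precisions.sum()
    return var * np.sum(precisions * means), var

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=400)
parts = np.array_split(data, 4)

# Step 1: sample each subposterior independently (sequentially here; each
# chain could run on its own worker). The first 1000 draws are burn-in.
sub_samples = [
    metropolis(lambda t, y=y: log_subposterior(t, y, len(parts)), 5000, 0.2, rng)[1000:]
    for y in parts
]

# Step 2: the server combines the local results.
combined_mean, combined_var = gaussian_combine(sub_samples)
print(combined_mean, combined_var)
```

In this toy problem every subposterior is unimodal and close to Gaussian, so the product rule works well. The failure modes named in the abstract arise when a local chain misses a mode or misrepresents a low-density region, because the product of the local approximations then inherits the error.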

Zero-Shot Assistance in Novel Decision Problems

Feb 15, 2022
Sebastiaan De Peuter, Samuel Kaski

We consider the problem of creating assistants that can help agents, often humans, solve novel sequential decision problems, assuming the agent is not able to specify the reward function explicitly to the assistant. Instead of aiming to automate and act in place of the agent, as current approaches do, we give the assistant an advisory role and keep the agent in the loop as the main decision maker. The difficulty is that we must account for potential biases induced by limitations or constraints of the agent, which may cause it to reject advice in a seemingly irrational way. To do this, we introduce a novel formalization of assistance that models these biases, allowing the assistant to infer and adapt to them. We then introduce a new method for planning the assistant's advice which can scale to large decision-making problems. Finally, we show experimentally that our approach adapts to these agent biases and results in higher cumulative reward for the agent than automation-based alternatives.

* 14 pages, 9 figures

Deconfounded Representation Similarity for Comparison of Neural Networks

Jan 31, 2022
Tianyu Cui, Yogesh Kumar, Pekka Marttinen, Samuel Kaski

Similarity metrics such as representational similarity analysis (RSA) and centered kernel alignment (CKA) have been used to compare layer-wise representations between neural networks. However, these metrics are confounded by the population structure of data items in the input space, leading to spuriously high similarity for even completely random neural networks and inconsistent domain relations in transfer learning. We introduce a simple and generally applicable fix to adjust for the confounder with covariate adjustment regression, which retains the intuitive invariance properties of the original similarity measures. We show that deconfounding the similarity metrics increases the resolution of detecting semantically similar neural networks. Moreover, in real-world applications, deconfounding improves the consistency of representation similarities with domain similarities in transfer learning, and increases correlation with out-of-distribution accuracy.
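
For readers unfamiliar with CKA, a minimal sketch of the standard linear variant follows. It shows only the unadjusted similarity measure and one of its invariance properties; the deconfounding step proposed in the paper is not reproduced here. The function name and the random test data are assumptions for illustration.

```python
import numpy as np

def linear_cka(x, y):
    # x: (n_items, d1) and y: (n_items, d2) hold layer activations
    # for the same n_items inputs.
    x = x - x.mean(axis=0, keepdims=True)  # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 16))

# Linear CKA is invariant to orthogonal transformations and isotropic scaling.
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
print(linear_cka(acts, acts))
print(linear_cka(acts, 3.0 * acts @ q))
```

Both printed values should equal 1 up to floating-point error, since the second representation is only a rotated and rescaled copy of the first.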

Approximate Bayesian Computation with Domain Expert in the Loop

Jan 28, 2022
Ayush Bharti, Louis Filstroff, Samuel Kaski

Approximate Bayesian computation (ABC) is a popular likelihood-free inference method for models with intractable likelihood functions. As ABC methods usually rely on comparing summary statistics of observed and simulated data, the choice of the statistics is crucial. This choice involves a trade-off between loss of information and dimensionality reduction, and is often determined based on domain knowledge. However, handcrafting and selecting suitable statistics is a laborious task involving multiple trial-and-error steps. In this work, we introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably. By involving the experts, we are able to handle misspecified models, unlike the existing dimension reduction methods. Moreover, empirical results show better posterior estimates than with existing methods, when the simulation budget is limited.
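
The role of summary statistics in ABC can be seen in the basic rejection sampler sketched below. This is the textbook algorithm, not the expert-in-the-loop selection method of the paper. The toy simulator (Gaussian with unknown mean), the uniform prior, the choice of the sample mean as the summary, and the tolerance `eps` are all assumptions for illustration.

```python
import numpy as np

def rejection_abc(observed, simulate, summary, prior_sample, n_draws, eps, rng):
    # Keep the parameter draws whose simulated summary statistics fall
    # within distance eps of the observed summary statistics.
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        s_sim = summary(simulate(theta, rng))
        if np.linalg.norm(s_sim - s_obs) <= eps:
            accepted.append(theta)
    return np.array(accepted)

rng = np.random.default_rng(0)
observed = rng.normal(1.5, 1.0, size=100)

posterior = rejection_abc(
    observed,
    simulate=lambda theta, rng: rng.normal(theta, 1.0, size=100),
    summary=lambda x: np.array([x.mean()]),  # one hand-picked statistic
    prior_sample=lambda rng: rng.uniform(-5.0, 5.0),
    n_draws=20000,
    eps=0.1,
    rng=rng,
)
print(len(posterior), posterior.mean())
```

Swapping the `summary` function changes what information about the data reaches the comparison, which is why the choice of statistics matters so much in practice.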

Non-separable Spatio-temporal Graph Kernels via SPDEs

Nov 16, 2021
Alexander Nikitin, ST John, Arno Solin, Samuel Kaski

Gaussian processes (GPs) provide a principled and direct approach for inference and learning on graphs. However, the lack of justified graph kernels for spatio-temporal modelling has held back their use in graph problems. We leverage an explicit link between stochastic partial differential equations (SPDEs) and GPs on graphs, and derive non-separable spatio-temporal graph kernels that capture interaction across space and time. We formulate the graph kernels for the stochastic heat equation and wave equation. We show that by providing novel tools for spatio-temporal GP modelling on graphs, we outperform pre-existing graph kernels in real-world applications that feature diffusion, oscillation, and other complicated interactions.
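
As background for the heat-equation connection, the sketch below computes the classical purely spatial diffusion (heat) kernel on a graph, K = exp(-tL), where L is the graph Laplacian. It is not the non-separable spatio-temporal kernel derived in the paper; the 4-node path graph and the diffusion time `t` are assumptions for illustration.

```python
import numpy as np

def graph_laplacian(adjacency):
    # Combinatorial Laplacian L = D - A for an undirected graph.
    return np.diag(adjacency.sum(axis=1)) - adjacency

def heat_kernel(adjacency, t):
    # exp(-t L) computed through the eigendecomposition of the symmetric L.
    eigvals, eigvecs = np.linalg.eigh(graph_laplacian(adjacency))
    return (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T

# Path graph with 4 nodes: 0 - 1 - 2 - 3
adjacency = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
K = heat_kernel(adjacency, t=0.5)
print(np.round(K, 3))
```

The resulting matrix is symmetric and positive definite, so it can serve as a GP covariance over the nodes; covariance decays with graph distance, and larger `t` spreads it further along the edges.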

Likelihood-Free Inference in State-Space Models with Unknown Dynamics

Nov 02, 2021
Alexander Aushev, Thong Tran, Henri Pesonen, Andrew Howes, Samuel Kaski

We introduce a method for inferring and predicting latent states in the important and difficult case of state-space models where observations can only be simulated, and transition dynamics are unknown. In this setting, the likelihood of observations is not available and only synthetic observations can be generated from a black-box simulator. We propose a way of doing likelihood-free inference (LFI) of states and state prediction with a limited number of simulations. Our approach uses a multi-output Gaussian process for state inference, and a Bayesian Neural Network as a model of the transition dynamics for state prediction. We improve upon existing LFI methods for the inference task, while also accurately learning transition dynamics. The proposed method is necessary for modelling inverse problems in dynamical systems with computationally expensive simulations, as demonstrated in experiments with non-stationary user models.

* 20 pages, 8 figures, uses arxiv.sty

Locally Differentially Private Bayesian Inference

Oct 27, 2021
Tejas Kulkarni, Joonas Jälkö, Samuel Kaski, Antti Honkela

In recent years, local differential privacy (LDP) has emerged as a technique of choice for privacy-preserving data collection in several scenarios where the aggregator is not trustworthy. LDP provides client-side privacy by adding noise at the user's end. Thus, clients need not rely on the trustworthiness of the aggregator. In this work, we provide a noise-aware probabilistic modeling framework, which allows Bayesian inference to take into account the noise added for privacy under LDP, conditioned on locally perturbed observations. The stronger privacy protection (compared to the central model) provided by LDP protocols comes at a much harsher privacy-utility trade-off. Our framework tackles several computational and statistical challenges posed by LDP for accurate uncertainty quantification under Bayesian settings. We demonstrate the efficacy of our framework in parameter estimation for univariate and multivariate distributions as well as logistic and linear regression.
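
To make "adding noise at the user's end" concrete, here is a minimal sketch of one standard LDP mechanism, the Laplace mechanism applied by each client to a bounded value. It is not the paper's inference framework: the bounds, the privacy parameter `epsilon`, the Beta-distributed toy data, and the moment-based noise correction at the end are assumptions for illustration.

```python
import numpy as np

def privatize(values, lower, upper, epsilon, rng):
    # Each client clips its own value to [lower, upper] and adds Laplace noise
    # with scale (upper - lower) / epsilon before sending it to the aggregator.
    clipped = np.clip(values, lower, upper)
    scale = (upper - lower) / epsilon
    return clipped + rng.laplace(0.0, scale, size=np.shape(clipped))

rng = np.random.default_rng(0)
true_values = rng.beta(2.0, 5.0, size=20000)  # private data in [0, 1]
epsilon = 1.0
noisy = privatize(true_values, 0.0, 1.0, epsilon, rng)

# The aggregator only sees `noisy`. The noise has zero mean, so the sample
# mean stays unbiased, but the variance is inflated by the known noise
# variance 2 * scale**2, which a noise-aware analysis has to subtract.
scale = 1.0 / epsilon
mean_estimate = noisy.mean()
variance_estimate = noisy.var() - 2.0 * scale**2
print(mean_estimate, variance_estimate)
```

Ignoring the added noise would overstate the data variance by more than an order of magnitude here, which gives a sense of why the noise has to be modeled explicitly.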

De-randomizing MCMC dynamics with the diffusion Stein operator

Oct 07, 2021
Zheyang Shen, Markus Heinonen, Samuel Kaski

Approximate Bayesian inference estimates descriptors of an intractable target distribution, which is in essence an optimization problem within a family of distributions. For example, Langevin dynamics (LD) extracts asymptotically exact samples from a diffusion process because the time evolution of its marginal distributions constitutes a curve that minimizes the KL-divergence via steepest descent in the Wasserstein space. Parallel to LD, Stein variational gradient descent (SVGD) similarly minimizes the KL, albeit endowed with a novel Stein-Wasserstein distance, by deterministically transporting a set of particle samples, thereby de-randomizing the stochastic diffusion process. We propose de-randomized kernel-based particle samplers for all diffusion-based samplers known as MCMC dynamics. Following previous work in interpreting MCMC dynamics, we equip the Stein-Wasserstein space with a fiber-Riemannian Poisson structure, with the capacity of characterizing a fiber-gradient Hamiltonian flow that simulates MCMC dynamics. Such dynamics discretize into generalized SVGD (GSVGD), a Stein-type deterministic particle sampler, with particle updates coinciding with applying the diffusion Stein operator to a kernel function. We demonstrate empirically that GSVGD can de-randomize complex MCMC dynamics, which combine the advantages of auxiliary momentum variables and Riemannian structure, while maintaining the high sample quality from an interacting particle system.

* 22 pages, 6 figures. NeurIPS 2021
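
The deterministic particle transport mentioned in the abstract can be illustrated with the standard SVGD update. The sketch below is plain SVGD with an RBF kernel, not the GSVGD sampler proposed in the paper; the Gaussian target, the fixed bandwidth, the step size, and the particle count are assumptions for illustration.

```python
import numpy as np

def svgd_step(particles, grad_log_p, step, bandwidth):
    # particles: (n, d). diffs[i, j] = x_i - x_j.
    diffs = particles[:, None, :] - particles[None, :, :]
    kernel = np.exp(-np.sum(diffs**2, axis=-1) / (2.0 * bandwidth**2))
    grads = grad_log_p(particles)  # (n, d)
    # Attraction: kernel-weighted average of the score at all particles.
    attract = kernel @ grads
    # Repulsion: gradient of the kernel, which pushes particles apart.
    repulse = np.sum(kernel[:, :, None] * diffs, axis=1) / bandwidth**2
    return particles + step * (attract + repulse) / particles.shape[0]

target_mean = np.array([2.0, -1.0])

def grad_log_p(x):
    # Score function of an isotropic unit-variance Gaussian target.
    return -(x - target_mean)

rng = np.random.default_rng(0)
particles = 0.1 * rng.normal(size=(50, 2))  # start far from the target
for _ in range(1000):
    particles = svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0)

print(particles.mean(axis=0), particles.var(axis=0))
```

After the random initialization, the updates involve no randomness: the particles move toward the target under the attraction term while the repulsion term keeps them from collapsing onto the mode.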

Learning to Assist Agents by Observing Them

Oct 04, 2021
Antti Keurulainen, Isak Westerlund, Samuel Kaski, Alexander Ilin

The ability of an AI agent to assist other agents, such as humans, is an important and challenging goal, which requires the assisting agent to reason about the behavior and infer the goals of the assisted agent. Training such an ability by using reinforcement learning usually requires large amounts of online training, which is difficult and costly. On the other hand, offline data about the behavior of the assisted agent might be available, but is non-trivial to take advantage of by methods such as offline reinforcement learning. We introduce methods where the capability to create a representation of the behavior is first pre-trained with offline data, after which only a small amount of interaction data is needed to learn an assisting policy. We test the setting in a gridworld where the helper agent has the capability to manipulate the environment of the assisted artificial agents, and introduce three different scenarios where the assistance considerably improves the performance of the assisted agents.
