Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jens Sjölund

Learning to accelerate distributed ADMM using graph neural networks

Sep 05, 2025

Henri Doerks, Paul Häusner, Daniel Hernández Escobar, Jens Sjölund

Abstract:Distributed optimization is fundamental in large-scale machine learning and control applications. Among existing methods, the Alternating Direction Method of Multipliers (ADMM) has gained popularity due to its strong convergence guarantees and suitability for decentralized computation. However, ADMM often suffers from slow convergence and sensitivity to hyperparameter choices. In this work, we show that distributed ADMM iterations can be naturally represented within the message-passing framework of graph neural networks (GNNs). Building on this connection, we propose to learn adaptive step sizes and communication weights by a graph neural network that predicts the hyperparameters based on the iterates. By unrolling ADMM for a fixed number of iterations, we train the network parameters end-to-end to minimize the final iterates error for a given problem class, while preserving the algorithm's convergence properties. Numerical experiments demonstrate that our learned variant consistently improves convergence speed and solution quality compared to standard ADMM. The code is available at https://github.com/paulhausner/learning-distributed-admm.

* Under review, the first two authors contributed equally

Via

Access Paper or Ask Questions

Machine learning for in-situ composition mapping in a self-driving magnetron sputtering system

Jun 06, 2025

Sanna Jarl, Jens Sjölund, Robert J. W. Frost, Anders Holst, Jonathan J. S. Scragg

Figure 1 for Machine learning for in-situ composition mapping in a self-driving magnetron sputtering system

Figure 2 for Machine learning for in-situ composition mapping in a self-driving magnetron sputtering system

Figure 3 for Machine learning for in-situ composition mapping in a self-driving magnetron sputtering system

Figure 4 for Machine learning for in-situ composition mapping in a self-driving magnetron sputtering system

Abstract:Self-driving labs (SDLs), employing automation and machine learning (ML) to accelerate experimental procedures, have enormous potential in the discovery of new materials. However, in thin film science, SDLs are mainly restricted to solution-based synthetic methods which are easier to automate but cannot access the broad chemical space of inorganic materials. This work presents an SDL based on magnetron co-sputtering. We are using combinatorial frameworks, obtaining accurate composition maps on multi-element, compositionally graded thin films. This normally requires time-consuming ex-situ analysis prone to systematic errors. We present a rapid and calibration-free in-situ, ML driven approach to produce composition maps for arbitrary source combinations and sputtering conditions. We develop a method to predict the composition distribution in a multi-element combinatorial thin film, using in-situ measurements from quartz-crystal microbalance sensors placed in a sputter chamber. For a given source, the sensor readings are learned as a function of the sputtering pressure and magnetron power, through active learning using Gaussian processes (GPs). The final GPs are combined with a geometric model of the deposition flux distribution in the chamber, which allows interpolation of the deposition rates from each source, at any position across the sample. We investigate several acquisition functions for the ML procedure. A fully Bayesian GP - BALM (Bayesian active learning MacKay) - achieved the best performance, learning the deposition rates for a single source in 10 experiments. Prediction accuracy for co-sputtering composition distributions was verified experimentally. Our framework dramatically increases throughput by avoiding the need for extensive characterisation or calibration, thus demonstrating the potential of ML-guided SDLs to accelerate materials exploration.

* 24 pages, 10 figures. Submitted to the journal npj computational materials

Via

Access Paper or Ask Questions

Forward-only Diffusion Probabilistic Models

May 22, 2025

Ziwei Luo, Fredrik K. Gustafsson, Jens Sjölund, Thomas B. Schön

Abstract:This work presents a forward-only diffusion (FoD) approach for generative modelling. In contrast to traditional diffusion models that rely on a coupled forward-backward diffusion scheme, FoD directly learns data generation through a single forward diffusion process, yielding a simple yet efficient generative framework. The core of FoD is a state-dependent linear stochastic differential equation that involves a mean-reverting term in both the drift and diffusion functions. This mean-reversion property guarantees the convergence to clean data, naturally simulating a stochastic interpolation between source and target distributions. More importantly, FoD is analytically tractable and is trained using a simple stochastic flow matching objective, enabling a few-step non-Markov chain sampling during inference. The proposed FoD model, despite its simplicity, achieves competitive performance on various image-conditioned (e.g., image restoration) and unconditional generation tasks, demonstrating its effectiveness in generative modelling. Our code is available at https://github.com/Algolzw/FoD.

* Project page: https://algolzw.github.io/fod

Via

Access Paper or Ask Questions

Towards Better Sample Efficiency in Multi-Agent Reinforcement Learning via Exploration

Mar 17, 2025

Amir Baghi, Jens Sjölund, Joakim Bergdahl, Linus Gisslén, Alessandro Sestini

Abstract:Multi-agent reinforcement learning has shown promise in learning cooperative behaviors in team-based environments. However, such methods often demand extensive training time. For instance, the state-of-the-art method TiZero takes 40 days to train high-quality policies for a football environment. In this paper, we hypothesize that better exploration mechanisms can improve the sample efficiency of multi-agent methods. We propose two different approaches for better exploration in TiZero: a self-supervised intrinsic reward and a random network distillation bonus. Additionally, we introduce architectural modifications to the original algorithm to enhance TiZero's computational efficiency. We evaluate the sample efficiency of these approaches through extensive experiments. Our results show that random network distillation improves training sample efficiency by 18.8% compared to the original TiZero. Furthermore, we evaluate the qualitative behavior of the models produced by both variants against a heuristic AI, with the self-supervised reward encouraging possession and random network distillation leading to a more offensive performance. Our results highlights the applicability of our random network distillation variant in practical settings. Lastly, due to the nature of the proposed method, we acknowledge its use beyond football simulation, especially in environments with strong multi-agent and strategic aspects.

* 8 pages, 3 figures

Via

Access Paper or Ask Questions

Online learning in motion modeling for intra-interventional image sequences

Oct 15, 2024

Niklas Gunnarsson, Jens Sjölund, Peter Kimstrand, Thomas. B Schön

Abstract:Image monitoring and guidance during medical examinations can aid both diagnosis and treatment. However, the sampling frequency is often too low, which creates a need to estimate the missing images. We present a probabilistic motion model for sequential medical images, with the ability to both estimate motion between acquired images and forecast the motion ahead of time. The core is a low-dimensional temporal process based on a linear Gaussian state-space model with analytically tractable solutions for forecasting, simulation, and imputation of missing samples. The results, from two experiments on publicly available cardiac datasets, show reliable motion estimates and an improved forecasting performance using patient-specific adaptation by online learning.

* Medical Image Computing and Computer Assisted Intervention (MICCAI) 2024

Via

Access Paper or Ask Questions

Taming Diffusion Models for Image Restoration: A Review

Sep 16, 2024

Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Figure 1 for Taming Diffusion Models for Image Restoration: A Review

Figure 2 for Taming Diffusion Models for Image Restoration: A Review

Figure 3 for Taming Diffusion Models for Image Restoration: A Review

Figure 4 for Taming Diffusion Models for Image Restoration: A Review

Abstract:Diffusion models have achieved remarkable progress in generative modelling, particularly in enhancing image quality to conform to human preferences. Recently, these models have also been applied to low-level computer vision for photo-realistic image restoration (IR) in tasks such as image denoising, deblurring, dehazing, etc. In this review paper, we introduce key constructions in diffusion models and survey contemporary techniques that make use of diffusion models in solving general IR tasks. Furthermore, we point out the main challenges and limitations of existing diffusion-based IR frameworks and provide potential directions for future work.

* Review paper; any comments and suggestions are most welcome!

Via

Access Paper or Ask Questions

Conditional sampling within generative diffusion models

Sep 15, 2024

Zheng Zhao, Ziwei Luo, Jens Sjölund, Thomas B. Schön

Abstract:Generative diffusions are a powerful class of Monte Carlo samplers that leverage bridging Markov processes to approximate complex, high-dimensional distributions, such as those found in image processing and language models. Despite their success in these domains, an important open challenge remains: extending these techniques to sample from conditional distributions, as required in, for example, Bayesian inverse problems. In this paper, we present a comprehensive review of existing computational approaches to conditional sampling within generative diffusion models. Specifically, we highlight key methodologies that either utilise the joint distribution, or rely on (pre-trained) marginal distributions with explicit likelihoods, to construct conditional generative samplers.

Via

Access Paper or Ask Questions

Learning incomplete factorization preconditioners for GMRES

Sep 12, 2024

Paul Häusner, Aleix Nieto Juscafresa, Jens Sjölund

Abstract:In this paper, we develop a data-driven approach to generate incomplete LU factorizations of large-scale sparse matrices. The learned approximate factorization is utilized as a preconditioner for the corresponding linear equation system in the GMRES method. Incomplete factorization methods are one of the most commonly applied algebraic preconditioners for sparse linear equation systems and are able to speed up the convergence of Krylov subspace methods. However, they are sensitive to hyper-parameters and might suffer from numerical breakdown or lead to slow convergence when not properly applied. We replace the typically hand-engineered algorithms with a graph neural network based approach that is trained against data to predict an approximate factorization. This allows us to learn preconditioners tailored for a specific problem distribution. We analyze and empirically evaluate different loss functions to train the learned preconditioners and show their effectiveness to decrease the number of GMRES iterations and improve the spectral properties on our synthetic dataset. The code is available at https://github.com/paulhausner/neural-incomplete-factorization.

* The first two authors contributed equally, under review. 12 pages, 5 figures

Via

Access Paper or Ask Questions

Conditioning diffusion models by explicit forward-backward bridging

May 22, 2024

Adrien Corenflos, Zheng Zhao, Simo Särkkä, Jens Sjölund, Thomas B. Schön

Figure 1 for Conditioning diffusion models by explicit forward-backward bridging

Figure 2 for Conditioning diffusion models by explicit forward-backward bridging

Figure 3 for Conditioning diffusion models by explicit forward-backward bridging

Figure 4 for Conditioning diffusion models by explicit forward-backward bridging

Abstract:Given an unconditional diffusion model $\pi(x, y)$, using it to perform conditional simulation $\pi(x \mid y)$ is still largely an open question and is typically achieved by learning conditional drifts to the denoising SDE after the fact. In this work, we express conditional simulation as an inference problem on an augmented space corresponding to a partial SDE bridge. This perspective allows us to implement efficient and principled particle Gibbs and pseudo-marginal samplers marginally targeting the conditional distribution $\pi(x \mid y)$. Contrary to existing methodology, our methods do not introduce any additional approximation to the unconditional diffusion model aside from the Monte Carlo error. We showcase the benefits and drawbacks of our approach on a series of synthetic and real data examples.

* 24 pages, 12 figures

Via

Access Paper or Ask Questions

Efficient Radiation Treatment Planning based on Voxel Importance

May 06, 2024

Sebastian Mair, Anqi Fu, Jens Sjölund

Abstract:Optimization is a time-consuming part of radiation treatment planning. We propose to reduce the optimization problem by only using a representative subset of informative voxels. This way, we improve planning efficiency while maintaining or enhancing the plan quality. To reduce the computational complexity of the optimization problem, we propose to subsample the set of voxels via importance sampling. We derive a sampling distribution based on an importance score that we obtain from pre-solving an easy optimization problem involving a simplified probing objective. By solving a reduced version of the original optimization problem using this subset, we effectively reduce the problem's size and computational demands while accounting for regions in which satisfactory dose deliveries are challenging. In contrast to other stochastic (sub-)sampling methods, our technique only requires a single sampling step to define a reduced optimization problem. This problem can be efficiently solved using established solvers. Empirical experiments on open benchmark data highlight substantially reduced optimization times, up to 50 times faster than the original ones, for intensity-modulated radiation therapy (IMRT), all while upholding plan quality comparable to traditional methods. Our approach has the potential to significantly accelerate radiation treatment planning by addressing its inherent computational challenges. We reduce the treatment planning time by reducing the size of the optimization problem rather than improving the optimization method. Our efforts are thus complementary to much of the previous developments.

* 20 pages, 11 figures

Via

Access Paper or Ask Questions