Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Julien Martinelli

IRS

In-Context Learning for Latent Space Bayesian Optimization

Jun 08, 2026

Tuan A. Vu, Harri Lähdesmäki, Julien Martinelli

Abstract:Bayesian optimization (BO) is a central tool for sample-efficient design, and latent-space Bayesian optimization (LSBO) extends it to structured objects such as molecules and proteins. In parallel, tabular foundation models such as TabPFN and TabICL now achieve state-of-the-art regression performance and are increasingly used as BO surrogates. Because their Bayesian behavior is induced by large synthetic pretraining collections, the composition of this pretraining distribution is crucial. LSBO creates a distinctive mismatch: the induced map from latent code to objective value differs markedly from the regression tasks used to train current in-context models. We address this mismatch by complementing the pretraining stage of tabular foundation model surrogates with synthetic optimization tasks defined on the latent space of a molecular VAE. The continued-pretraining objective features a regularizer that anchors the model to the original checkpoint, preserving its broad regression prior while avoiding overspecialization to the adaptation tasks. On held-out molecular optimization benchmarks, the resulting model achieves strong performance, supporting the relevance of LSBO-specific adaptation for in-context surrogates.

Via

Access Paper or Ask Questions

Bayesian Nonparametric Mixed-Effect ODEs with Gaussian Processes

May 13, 2026

Julien Martinelli, Maksim Sinelnikov, Harri Lähdesmäki, Quentin Clairon, Mélanie Prague

Abstract:Dynamical modelling is central to many scientific domains, including pharmacometrics, systems biology, physiology, and epidemiology. In these settings, heterogeneity is often intrinsic: different subjects or units follow related but distinct continuous-time dynamics. Classical nonlinear mixed-effects Ordinary Differential Equation (ODE) models address this by combining population-level structure with subject-specific effects, but they rely on a parametric vector field and are therefore vulnerable to structural misspecification and unmodelled mechanisms. This motivates nonparametric approaches that can retain principled uncertainty quantification, yet existing nonparametric ODE methods typically assume a single shared dynamical system rather than an explicit mixed-effect hierarchy over subject-specific dynamics. We propose MEGPODE, a Bayesian nonparametric mixed-effect ODE model in which each subject's vector field is decomposed into a shared population component and a subject-specific deviation, both endowed with Gaussian process (GP) priors. To avoid repeated ODE solves per subject during training, we combine state-space GP trajectory priors with virtual collocation observations, yielding Kalman-smoothing trajectory updates and closed-form regressions for the vector fields. Across controlled heterogeneous ODE benchmarks spanning oscillatory, biomedical systems, MEGPODE improves population-field recovery and subject-level trajectory prediction relative to strong baselines.

Via

Access Paper or Ask Questions

Online Sharp-Calibrated Bayesian Optimization

May 11, 2026

Marshal Arijona Sinaga, Julien Martinelli, Teemu Turpeinen, Samuel Kaski

Abstract:Bayesian optimization (BO) is a widely used framework for optimizing expensive black-box functions, commonly based on Gaussian process (GP) surrogate models. Its effectiveness relies on uncertainty quantification that is both sharp (informative) and well-calibrated along the BO trajectory. In practice, GP kernel hyperparameters are unknown and are refit online from sequentially collected (non-i.i.d.) data, which can yield miscalibrated or overly conservative uncertainty and lies outside the fixed-kernel assumptions of standard BO regret theory. We propose Online Sharp-Calibrated Bayesian Optimization (OSCBO), a BO algorithm that adaptively balances GP sharpness and calibration by casting hyperparameter selection as a constrained online-learning problem. We also show that OSCBO preserves sublinear regret bounds by leveraging the theoretical guarantees of the underlying online learning algorithm. Empirically, OSCBO performs competitively across synthetic and real-world benchmarks, ranking among the strongest methods in final simple regret while maintaining robust cumulative-regret behavior.

Via

Access Paper or Ask Questions

In-Context Black-Box Optimization with Unreliable Feedback

May 07, 2026

Nicolas Samuel Blumer, Julien Martinelli, Samuel Kaski

Abstract:Black-box optimization in science and engineering often comes with side information: experts, simulators, pretrained predictors, or heuristics can suggest which candidates look promising. This information can accelerate search, but it can also be biased, input-dependent, or misleading. Feedback-aware BO methods typically handle one task at a time, limiting their ability to generalize over multiple sources of feedback. In-context optimizers address cross-task adaptation, but usually assume that optimization history is the only available signal at test time. We study feedback-informed in-context black-box optimization (FICBO), where a pretrained optimizer conditions on both the observed history and cheap auxiliary feedback for the current candidate set. We introduce a structured feedback prior that models how feedback sources vary in their access, relevance, and distortion relative to the true objective, and use it to pretrain a feedback-aware transformer. At test time, the model estimates source reliability in context by comparing observed objective values with auxiliary signals, improving query selection. On synthetic and real-world tasks, FICBO effectively exploits informative feedback while remaining robust to weak or misleading sources, improving over other baselines. Empirical investigations further illustrate how the model perceives test-time sources, offering insights into its interpretability and decision-making process.

Via

Access Paper or Ask Questions

Time-Aware Latent Space Bayesian Optimization

Mar 01, 2026

Tuan A. Vu, Julien Martinelli, Harri Lähdesmäki

Abstract:Latent-space Bayesian optimization (LSBO) extends Bayesian optimization to structured domains, such as molecular design, by searching in the continuous latent space of a generative model. However, most LSBO methods assume a fixed objective, whereas real design campaigns often face temporal drift (e.g., evolving preferences or shifting targets). Bringing time-varying BO into LSBO is nontrivial: drift can affect not only the surrogate, but also the latent search space geometry induced by the representation. We propose Time-Aware Latent-space Bayesian Optimization (TALBO), which incorporates time in both the surrogate and the learned generative representation via a GP-prior variational autoencoder, yielding a latent space aligned as objectives evolve. To evaluate timevarying LSBO systematically, we adapt widely used molecular design tasks to drifting multi-property objectives and introduce metrics tailored to changing targets. Across these benchmarks, TALBO consistently outperforms strong LSBO baselines and remains robust across drift speeds and design choices, while remaining competitive under actually time-invariant objectives.

Via

Access Paper or Ask Questions

In-Context Multi-Objective Optimization

Dec 11, 2025

Xinyu Zhang, Conor Hassan, Julien Martinelli, Daolang Huang, Samuel Kaski

Abstract:Balancing competing objectives is omnipresent across disciplines, from drug design to autonomous systems. Multi-objective Bayesian optimization is a promising solution for such expensive, black-box problems: it fits probabilistic surrogates and selects new designs via an acquisition function that balances exploration and exploitation. In practice, it requires tailored choices of surrogate and acquisition that rarely transfer to the next problem, is myopic when multi-step planning is often required, and adds refitting overhead, particularly in parallel or time-sensitive loops. We present TAMO, a fully amortized, universal policy for multi-objective black-box optimization. TAMO uses a transformer architecture that operates across varying input and objective dimensions, enabling pretraining on diverse corpora and transfer to new problems without retraining: at test time, the pretrained model proposes the next design with a single forward pass. We pretrain the policy with reinforcement learning to maximize cumulative hypervolume improvement over full trajectories, conditioning on the entire query history to approximate the Pareto frontier. Across synthetic benchmarks and real tasks, TAMO produces fast proposals, reducing proposal time by 50-1000x versus alternatives while matching or improving Pareto quality under tight evaluation budgets. These results show that transformers can perform multi-objective optimization entirely in-context, eliminating per-task surrogate fitting and acquisition engineering, and open a path to foundation-style, plug-and-play optimizers for scientific discovery workflows.

Via

Access Paper or Ask Questions

Robust and Computation-Aware Gaussian Processes

May 27, 2025

Marshal Arijona Sinaga, Julien Martinelli, Samuel Kaski

Abstract:Gaussian processes (GPs) are widely used for regression and optimization tasks such as Bayesian optimization (BO) due to their expressiveness and principled uncertainty estimates. However, in settings with large datasets corrupted by outliers, standard GPs and their sparse approximations struggle with computational tractability and robustness. We introduce Robust Computation-aware Gaussian Process (RCaGP), a novel GP model that jointly addresses these challenges by combining a principled treatment of approximation-induced uncertainty with robust generalized Bayesian updating. The key insight is that robustness and approximation-awareness are not orthogonal but intertwined: approximations can exacerbate the impact of outliers, and mitigating one without the other is insufficient. Unlike previous work that focuses narrowly on either robustness or approximation quality, RCaGP combines both in a principled and scalable framework, thus effectively managing both outliers and computational uncertainties introduced by approximations such as low-rank matrix multiplications. Our model ensures more conservative and reliable uncertainty estimates, a property we rigorously demonstrate. Additionally, we establish a robustness property and show that the mean function is key to preserving it, motivating a tailored model selection scheme for robust mean functions. Empirical results confirm that solving these challenges jointly leads to superior performance across both clean and outlier-contaminated settings, both on regression and high-throughput Bayesian optimization benchmarks.

Via

Access Paper or Ask Questions

Challenges in interpretability of additive models

Apr 14, 2025

Xinyu Zhang, Julien Martinelli, ST John

Figure 1 for Challenges in interpretability of additive models

Figure 2 for Challenges in interpretability of additive models

Figure 3 for Challenges in interpretability of additive models

Abstract:We review generalized additive models as a type of ``transparent'' model that has recently seen renewed interest in the deep learning community as neural additive models. We highlight multiple types of nonidentifiability in this model class and discuss challenges in interpretability, arguing for restraint when claiming ``interpretability'' or ``suitability for safety-critical applications'' of such models.

* XAI-IJCAI24: Explainable AI workshop @ IJCAI 2024

Via

Access Paper or Ask Questions

PABBO: Preferential Amortized Black-Box Optimization

Mar 02, 2025

Xinyu Zhang, Daolang Huang, Samuel Kaski, Julien Martinelli

Abstract:Preferential Bayesian Optimization (PBO) is a sample-efficient method to learn latent user utilities from preferential feedback over a pair of designs. It relies on a statistical surrogate model for the latent function, usually a Gaussian process, and an acquisition strategy to select the next candidate pair to get user feedback on. Due to the non-conjugacy of the associated likelihood, every PBO step requires a significant amount of computations with various approximate inference techniques. This computational overhead is incompatible with the way humans interact with computers, hindering the use of PBO in real-world cases. Building on the recent advances of amortized BO, we propose to circumvent this issue by fully amortizing PBO, meta-learning both the surrogate and the acquisition function. Our method comprises a novel transformer neural process architecture, trained using reinforcement learning and tailored auxiliary losses. On a benchmark composed of synthetic and real-world datasets, our method is several orders of magnitude faster than the usual Gaussian process-based strategies and often outperforms them in accuracy.

* 25 pages, 17 figures. Accepted at the Thirteenth International Conference on Learning Representations (ICLR 2025)

Via

Access Paper or Ask Questions

Proxy-informed Bayesian transfer learning with unknown sources

Nov 05, 2024

Sabina J. Sloman, Julien Martinelli, Samuel Kaski

Figure 1 for Proxy-informed Bayesian transfer learning with unknown sources

Figure 2 for Proxy-informed Bayesian transfer learning with unknown sources

Figure 3 for Proxy-informed Bayesian transfer learning with unknown sources

Figure 4 for Proxy-informed Bayesian transfer learning with unknown sources

Abstract:Generalization outside the scope of one's training data requires leveraging prior knowledge about the effects that transfer, and the effects that don't, between different data sources. Bayesian transfer learning is a principled paradigm for specifying this knowledge, and refining it on the basis of data from the source (training) and target (prediction) tasks. We address the challenging transfer learning setting where the learner (i) cannot fine-tune in the target task, and (ii) does not know which source data points correspond to the same task (i.e., the data sources are unknown). We propose a proxy-informed robust method for probabilistic transfer learning (PROMPT), which provides a posterior predictive estimate tailored to the structure of the target task, without requiring the learner have access to any outcome information from the target task. Instead, PROMPT relies on the availability of proxy information. PROMPT uses the same proxy information for two purposes: (i) estimation of effects specific to the target task, and (ii) construction of a robust reweighting of the source data for estimation of effects that transfer between tasks. We provide theoretical results on the effect of this reweighting on the risk of negative transfer, and demonstrate application of PROMPT in two synthetic settings.

Via

Access Paper or Ask Questions