Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Matías Carrasco

Concave Statistical Utility Maximization Bandits via Influence-Function Gradients

Apr 28, 2026

Matías Carrasco, Alejandro Cholaquidis

Abstract:We study stochastic multi-armed bandits in which the objective is a statistical functional of the long-run reward distribution, rather than expected reward alone. Under mild continuity assumptions, we show that the infinite-horizon problem reduces to optimizing over stationary mixed policies: each weight vector \(w\) on the simplex induces a mixture law \(P^w\), and performance is measured by the concave utility \(U(w)=\mathfrak U(P^w)\). For differentiable statistical utilities, we use influence-function calculus to derive stochastic gradient estimators from bandit feedback. This leads to an entropic mirror-ascent algorithm on a truncated simplex, implemented through multiplicative-weights updates and plug-in estimates of the influence function. We establish regret bounds that separate the mirror-ascent optimization error from the bias caused by estimating the influence function. The framework is developed for general concave distributional utilities and illustrated through variance and Wasserstein objectives, with numerical experiments comparing exact and plug-in influence-function implementations.

Via

Access Paper or Ask Questions

Congruence-based Learning of Probabilistic Deterministic Finite Automata

Dec 12, 2024

Matías Carrasco, Franz Mayr, Sergio Yovine

Abstract:This work studies the question of learning probabilistic deterministic automata from language models. For this purpose, it focuses on analyzing the relations defined on algebraic structures over strings by equivalences and similarities on probability distributions. We introduce a congruence that extends the classical Myhill-Nerode congruence for formal languages. This new congruence is the basis for defining regularity over language models. We present an active learning algorithm that computes the quotient with respect to this congruence whenever the language model is regular. The paper also defines the notion of recognizability for language models and shows that it coincides with regularity for congruences. For relations which are not congruences, it shows that this is not the case. Finally, it discusses the impact of this result on learning in the context of language models.

Via

Access Paper or Ask Questions

Analyzing constrained LLM through PDFA-learning

Jun 12, 2024

Matías Carrasco, Franz Mayr, Sergio Yovine, Johny Kidd, Martín Iturbide, Juan Pedro da Silva, Alejo Garat

Figure 1 for Analyzing constrained LLM through PDFA-learning

Figure 2 for Analyzing constrained LLM through PDFA-learning

Figure 3 for Analyzing constrained LLM through PDFA-learning

Figure 4 for Analyzing constrained LLM through PDFA-learning

Abstract:We define a congruence that copes with null next-symbol probabilities that arise when the output of a language model is constrained by some means during text generation. We develop an algorithm for efficiently learning the quotient with respect to this congruence and evaluate it on case studies for analyzing statistical properties of LLM.

* Workshop Paper

Via

Access Paper or Ask Questions