Arthur Renard

Cross-Entropy Games and Frost Training

May 26, 2026

Arthur Renard, Franck Gabriel, Valentin Hartmann, Clément Hongler

Abstract: We present Frost Training, a method for improving Monte Carlo-based policy optimization for a large family of LLM-as-a-judge tasks called Cross-Entropy Games. The key idea is to exploit the gradient of the reward function in embedding space. This signal is used in the Greedy Coordinate Gradient (GCG) jailbreaking technique; we demonstrate for the first time that it can also be used to boost model training. We validate our method using GRPO training for maximum-likelihood infilling. Frost Training improves the model's ability to generate high-scoring outputs, reaching higher maximum scores in a best-of-k setting, and does so at an increased speed.

* 14 pages, 6 figures
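
The abstract above mentions GRPO training and a best-of-k evaluation. As a rough illustration only, the sketch below shows the group-relative reward normalization commonly associated with GRPO and a plain best-of-k maximum. It is not the Frost Training method and makes no use of embedding-space gradients; the function names and the toy reward values are hypothetical.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group (GRPO-style baseline).

    Each sample's advantage is its reward minus the group mean,
    divided by the group standard deviation (eps avoids division by zero).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def best_of_k(scores, k):
    """Return the highest score among the first k samples."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    return max(scores[:k])


# Toy rewards for four sampled completions (hypothetical values,
# e.g. log-likelihood scores assigned by a judge model).
rewards = [-3.0, -1.0, -2.0, -2.0]
advantages = group_relative_advantages(rewards)
```

Samples that score above their group's mean receive a positive advantage and are reinforced; `best_of_k(rewards, k)` tracks how the best available score grows as more samples are drawn.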

Cognitive Training for Language Models: Towards General Capabilities via Cross-Entropy Games

Mar 23, 2026

Clément Hongler, Franck Gabriel, Valentin Hartmann, Arthur Renard, Andrew Emil

Abstract: Defining a constructive process to build general capabilities for language models in an automatic manner is considered an open problem in artificial intelligence. Towards this, we consider the problem of building a curriculum of tasks that grows a model via relevant skill discovery. We provide a concrete framework for this task, using a family of tasks called cross-entropy games, which we postulate is universal in a suitable sense. We show that if it is possible to grow the curriculum for relevant skill discovery by iterating a greedy optimization algorithm, then, under natural assumptions, there is essentially only one meta-objective possible (up to a few hyperparameters). We call the resulting process cognitive training. We postulate that, given sufficiently capable language models as players and meta-samplers and sufficient training time, cognitive training provides a principled route to relevant skill discovery; hence, to the extent that general capabilities are achievable via greedy curriculum learning, cognitive training would be a solution.

* 20 pages

Looking for Complexity at Phase Boundaries in Continuous Cellular Automata

Mar 08, 2024

Vassilis Papadopoulos, Guilhem Doat, Arthur Renard, Clément Hongler

Abstract: One key challenge in Artificial Life is designing systems that display an emergence of complex behaviors. Many such systems depend on a high-dimensional parameter space, only a small subset of which displays interesting dynamics. Focusing on the case of continuous systems, we introduce the 'Phase Transition Finder' (PTF) algorithm, which can be used to efficiently generate parameters lying at the border between two phases. We argue that such points are more likely to display complex behaviors, and confirm this by applying PTF to Lenia, showing that it can increase the frequency of interesting behaviors more than twofold while remaining efficient enough for large-scale searches.

* 5 pages
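
The abstract above describes generating parameters that lie at the border between two phases. One generic way to locate such a point, given two parameter vectors known to fall in different phases, is bisection along the segment joining them. The sketch below illustrates that idea only; it is not the paper's PTF algorithm, and `phase`, `toy_phase`, and the example parameters are hypothetical stand-ins for a real phase classifier.

```python
def bisect_phase_boundary(phase, p_a, p_b, tol=1e-6, max_iter=100):
    """Find a point near the phase boundary on the segment from p_a to p_b.

    `phase` maps a parameter vector to a phase label. The endpoints must
    have different labels. Any midpoint label other than p_a's is treated
    as lying on p_b's side of the boundary.
    """
    label_a, label_b = phase(p_a), phase(p_b)
    if label_a == label_b:
        raise ValueError("endpoints must lie in different phases")

    def point(t):
        # Linear interpolation between the two parameter vectors.
        return [a + t * (b - a) for a, b in zip(p_a, p_b)]

    lo, hi = 0.0, 1.0
    for _ in range(max_iter):
        if hi - lo < tol:
            break
        mid = (lo + hi) / 2
        if phase(point(mid)) == label_a:
            lo = mid  # boundary is further toward p_b
        else:
            hi = mid  # boundary is closer to p_a
    return point((lo + hi) / 2)


def toy_phase(p):
    """Hypothetical classifier: two phases separated by the unit circle."""
    return "inside" if sum(x * x for x in p) < 1.0 else "outside"


boundary = bisect_phase_boundary(toy_phase, [0.0, 0.0], [2.0, 0.0])
```

In a real search, `phase` would run the system (for example a continuous cellular automaton) with the given parameters and classify the outcome, so each bisection step costs one simulation.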
