Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Alexia Jolicoeur-Martineau

Probabilistic Tiny Recursive Model

May 19, 2026

Amin Sghaier, Ali Parviz, Alexia Jolicoeur-Martineau

Abstract:Tiny Recursive Models (TRM) solve complex reasoning tasks with a fraction of the parameters of modern large language models (LLMs) by iteratively refining a latent state and final answer. While powerful, their deterministic recursion can lead to convergence at suboptimal solutions, without escape mechanism. A common workaround relies on task-specific input perturbations at test time combined with answer aggregation via voting. We introduce Probabilistic TRM (PTRM), a task-agnostic framework for test-time compute scaling that addresses this limitation through stochastic exploration. PTRM injects Gaussian noise at each deep recursion step, enabling parallel trajectories to explore diverse solution basins, and selects among them using the model's existing Q head (used for early stopping in the original TRM). Without requiring retraining or task-specific augmentations, PTRM enables substantial accuracy gains across benchmarks, including Sudoku-Extreme (87.4% to 98.75%) and on various puzzles from Pencil Puzzle Bench (62.6% to 91.2%). On the latter, PTRM achieves nearly double the accuracy of frontier LLMs (91.2% vs. 55.1%) at less than 0.0001x the cost, using only 7M parameters.

Via

Access Paper or Ask Questions

On the Sample Efficiency of Inverse Dynamics Models for Semi-Supervised Imitation Learning

Feb 02, 2026

Sacha Morin, Moonsub Byeon, Alexia Jolicoeur-Martineau, Sébastien Lachapelle

Abstract:Semi-supervised imitation learning (SSIL) consists in learning a policy from a small dataset of action-labeled trajectories and a much larger dataset of action-free trajectories. Some SSIL methods learn an inverse dynamics model (IDM) to predict the action from the current state and the next state. An IDM can act as a policy when paired with a video model (VM-IDM) or as a label generator to perform behavior cloning on action-free data (IDM labeling). In this work, we first show that VM-IDM and IDM labeling learn the same policy in a limit case, which we call the IDM-based policy. We then argue that the previously observed advantage of IDM-based policies over behavior cloning is due to the superior sample efficiency of IDM learning, which we attribute to two causes: (i) the ground-truth IDM tends to be contained in a lower complexity hypothesis class relative to the expert policy, and (ii) the ground-truth IDM is often less stochastic than the expert policy. We argue these claims based on insights from statistical learning theory and novel experiments, including a study of IDM-based policies using recent architectures for unified video-action prediction (UVA). Motivated by these insights, we finally propose an improved version of the existing LAPO algorithm for latent action policy learning.

Via

Access Paper or Ask Questions

Less is More: Recursive Reasoning with Tiny Networks

Oct 06, 2025

Alexia Jolicoeur-Martineau

Abstract:Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language models (LLMs) on hard puzzle tasks such as Sudoku, Maze, and ARC-AGI while trained with small models (27M parameters) on small data (around 1000 examples). HRM holds great promise for solving hard problems with small networks, but it is not yet well understood and may be suboptimal. We propose Tiny Recursive Model (TRM), a much simpler recursive reasoning approach that achieves significantly higher generalization than HRM, while using a single tiny network with only 2 layers. With only 7M parameters, TRM obtains 45% test-accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, higher than most LLMs (e.g., Deepseek R1, o3-mini, Gemini 2.5 Pro) with less than 0.01% of the parameters.

Via

Access Paper or Ask Questions

Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings

Aug 01, 2025

Alexia Jolicoeur-Martineau

Figure 1 for Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings

Figure 2 for Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings

Figure 3 for Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings

Figure 4 for Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings

Abstract:While AI excels at generating text, audio, images, and videos, creating interactive audio-visual content such as video games remains challenging. Current LLMs can generate JavaScript games and animations, but lack automated evaluation metrics and struggle with complex content that normally requires teams of humans working for many months (multi-shot, multi-agents) using assets made by artists. To tackle these issues, we built a new metric and a multi-agent system. We propose AVR-Eval, a relative metric for multimedia content quality using Audio-Visual Recordings (AVRs). An omni-modal model (processing text, video, and audio) compares the AVRs of two contents, with a text model reviewing evaluations to determine superiority. We show that AVR-Eval properly identifies good from broken or mismatched content. We built AVR-Agent, a multi-agent system generating JavaScript code from a bank of multimedia assets (audio, images, 3D models). The coding agent selects relevant assets, generates multiple initial codes, uses AVR-Eval to identify the best version, and iteratively improves it through omni-modal agent feedback from the AVR. We run experiments on games and animations with AVR-Eval (win rate of content A against B). We find that content generated by AVR-Agent has a significantly higher win rate against content made through one-shot generation. However, models struggle to leverage custom assets and AVR feedback effectively, showing no higher win rate. This reveals a critical gap: while humans benefit from high-quality assets and audio-visual feedback, current coding models do not seem to utilize these resources as effectively, highlighting fundamental differences between human and machine content creation approaches.

Via

Access Paper or Ask Questions

Generating $π$-Functional Molecules Using STGG+ with Active Learning

Feb 20, 2025

Alexia Jolicoeur-Martineau, Yan Zhang, Boris Knyazev, Aristide Baratin, Cheng-Hao Liu

Abstract:Generating novel molecules with out-of-distribution properties is a major challenge in molecular discovery. While supervised learning methods generate high-quality molecules similar to those in a dataset, they struggle to generalize to out-of-distribution properties. Reinforcement learning can explore new chemical spaces but often conducts 'reward-hacking' and generates non-synthesizable molecules. In this work, we address this problem by integrating a state-of-the-art supervised learning method, STGG+, in an active learning loop. Our approach iteratively generates, evaluates, and fine-tunes STGG+ to continuously expand its knowledge. We denote this approach STGG+AL. We apply STGG+AL to the design of organic $\pi$-functional materials, specifically two challenging tasks: 1) generating highly absorptive molecules characterized by high oscillator strength and 2) designing absorptive molecules with reasonable oscillator strength in the near-infrared (NIR) range. The generated molecules are validated and rationalized in-silico with time-dependent density functional theory. Our results demonstrate that our method is highly effective in generating novel molecules with high oscillator strength, contrary to existing methods such as reinforcement learning (RL) methods. We open-source our active-learning code along with our Conjugated-xTB dataset containing 2.9 million $\pi$-conjugated molecules and the function for approximating the oscillator strength and absorption wavelength (based on sTDA-xTB).

* Code: https://github.com/SamsungSAILMontreal/STGG-AL

Via

Access Paper or Ask Questions

Understanding Adam Requires Better Rotation Dependent Assumptions

Oct 25, 2024

Lucas Maes, Tianyue H. Zhang, Alexia Jolicoeur-Martineau, Ioannis Mitliagkas, Damien Scieur, Simon Lacoste-Julien, Charles Guille-Escuret

Figure 1 for Understanding Adam Requires Better Rotation Dependent Assumptions

Figure 2 for Understanding Adam Requires Better Rotation Dependent Assumptions

Figure 3 for Understanding Adam Requires Better Rotation Dependent Assumptions

Figure 4 for Understanding Adam Requires Better Rotation Dependent Assumptions

Abstract:Despite its widespread adoption, Adam's advantage over Stochastic Gradient Descent (SGD) lacks a comprehensive theoretical explanation. This paper investigates Adam's sensitivity to rotations of the parameter space. We demonstrate that Adam's performance in training transformers degrades under random rotations of the parameter space, indicating a crucial sensitivity to the choice of basis. This reveals that conventional rotation-invariant assumptions are insufficient to capture Adam's advantages theoretically. To better understand the rotation-dependent properties that benefit Adam, we also identify structured rotations that preserve or even enhance its empirical performance. We then examine the rotation-dependent assumptions in the literature, evaluating their adequacy in explaining Adam's behavior across various rotation types. This work highlights the need for new, rotation-dependent theoretical frameworks to fully understand Adam's empirical success in modern machine learning tasks.

Via

Access Paper or Ask Questions

Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching

Oct 20, 2024

Ange-Clément Akazan, Alexia Jolicoeur-Martineau, Ioannis Mitliagkas

Figure 1 for Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching

Figure 2 for Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching

Figure 3 for Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching

Figure 4 for Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching

Abstract:Privacy and regulatory constraints make data generation vital to advancing machine learning without relying on real-world datasets. A leading approach for tabular data generation is the Forest Flow (FF) method, which combines Flow Matching with XGBoost. Despite its good performance, FF is slow and makes errors when treating categorical variables as one-hot continuous features. It is also highly sensitive to small changes in the initial conditions of the ordinary differential equation (ODE). To overcome these limitations, we develop Heterogeneous Sequential Feature Forest Flow (HS3F). Our method generates data sequentially (feature-by-feature), reducing the dependency on noisy initial conditions through the additional information from previously generated features. Furthermore, it generates categorical variables using multinomial sampling (from an XGBoost classifier) instead of flow matching, improving generation speed. We also use a Runge-Kutta 4th order (Rg4) ODE solver for improved performance over the Euler solver used in FF. Our experiments with 25 datasets reveal that HS3F produces higher quality and more diverse synthetic data than FF, especially for categorical variables. It also generates data 21-27 times faster for datasets with $\geq20%$ categorical variables. HS3F further demonstrates enhanced robustness to affine transformation in flow ODE initial conditions compared to FF. This study not only validates the HS3F but also unveils promising new strategies to advance generative models.

Via

Access Paper or Ask Questions

Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality

Oct 07, 2024

Ge Ya, Luo, Gian Favero, Zhi Hao Luo, Alexia Jolicoeur-Martineau, Christopher Pal

Figure 1 for Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality

Figure 2 for Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality

Figure 3 for Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality

Figure 4 for Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality

Abstract:The Fr\'echet Video Distance (FVD) is a widely adopted metric for evaluating video generation distribution quality. However, its effectiveness relies on critical assumptions. Our analysis reveals three significant limitations: (1) the non-Gaussianity of the Inflated 3D Convnet (I3D) feature space; (2) the insensitivity of I3D features to temporal distortions; (3) the impractical sample sizes required for reliable estimation. These findings undermine FVD's reliability and show that FVD falls short as a standalone metric for video generation evaluation. After extensive analysis of a wide range of metrics and backbone architectures, we propose JEDi, the JEPA Embedding Distance, based on features derived from a Joint Embedding Predictive Architecture, measured using Maximum Mean Discrepancy with polynomial kernel. Our experiments on multiple open-source datasets show clear evidence that it is a superior alternative to the widely used FVD metric, requiring only 16% of the samples to reach its steady value, while increasing alignment with human evaluation by 34%, on average.

Via

Access Paper or Ask Questions

Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees

Jul 12, 2024

Alexia Jolicoeur-Martineau, Aristide Baratin, Kisoo Kwon, Boris Knyazev, Yan Zhang

Figure 1 for Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees

Figure 2 for Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees

Figure 3 for Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees

Figure 4 for Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees

Abstract:Generating novel molecules is challenging, with most representations leading to generative models producing many invalid molecules. Spanning Tree-based Graph Generation (STGG) is a promising approach to ensure the generation of valid molecules, outperforming state-of-the-art SMILES and graph diffusion models for unconditional generation. In the real world, we want to be able to generate molecules conditional on one or multiple desired properties rather than unconditionally. Thus, in this work, we extend STGG to multi-property-conditional generation. Our approach, STGG+, incorporates a modern Transformer architecture, random masking of properties during training (enabling conditioning on any subset of properties and classifier-free guidance), an auxiliary property-prediction loss (allowing the model to self-criticize molecules and select the best ones), and other improvements. We show that STGG+ achieves state-of-the-art performance on in-distribution and out-of-distribution conditional generation, and reward maximization.

Via

Access Paper or Ask Questions

Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion

Jun 09, 2024

Ge Ya Luo, Zhi Hao Luo, Anthony Gosselin, Alexia Jolicoeur-Martineau, Christopher Pal

Figure 1 for Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion

Figure 2 for Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion

Figure 3 for Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion

Figure 4 for Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion

Abstract:With recent advances in video prediction, controllable video generation has been attracting more attention. Generating high fidelity videos according to simple and flexible conditioning is of particular interest. To this end, we propose a controllable video generation model using pixel level renderings of 2D or 3D bounding boxes as conditioning. In addition, we also create a bounding box predictor that, given the initial and ending frames' bounding boxes, can predict up to 15 bounding boxes per frame for all the frames in a 25-frame clip. We perform experiments across 3 well-known AV video datasets: KITTI, Virtual-KITTI 2 and BDD100k.

Via

Access Paper or Ask Questions