Vincent François-Lavet

A Machine with Short-Term, Episodic, and Semantic Memory Systems

Dec 05, 2022
Taewoon Kim, Michael Cochez, Vincent François-Lavet, Mark Neerincx, Piek Vossen

Inspired by cognitive science theories of explicit human memory, we model an agent with short-term, episodic, and semantic memory systems, each represented as a knowledge graph. To evaluate this design and analyze the agent's behavior, we designed and released a reinforcement learning environment, "the Room", in which the agent must learn how to encode, store, and retrieve memories to maximize its return by answering questions. We show that our deep Q-learning agent successfully learns whether a short-term memory should be forgotten or instead stored in the episodic or semantic memory system. Our experiments indicate that an agent with human-like memory systems can outperform an agent without this memory structure in this environment.
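
The memory-management decision described above can be pictured with a small sketch. The snippet below is an illustrative assumption, not the authors' released code: a tiny Q-network scores three actions for each short-term memory (forget it, move it to the episodic system, or move it to the semantic system), with an epsilon-greedy choice as in standard deep Q-learning; the featurization and all names are hypothetical.

```python
# Hypothetical sketch: a Q-network choosing what to do with a short-term memory.
import random
import torch
import torch.nn as nn

ACTIONS = ["forget", "to_episodic", "to_semantic"]

class MemoryManagerQNet(nn.Module):
    def __init__(self, feature_dim: int = 8, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, len(ACTIONS)),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # one Q-value per memory-management action

def featurize(triple: tuple) -> torch.Tensor:
    # Toy stand-in for a knowledge-graph embedding of a (head, relation, tail) memory.
    torch.manual_seed(abs(hash(triple)) % (2**31))
    return torch.randn(8)

qnet = MemoryManagerQNet()
short_term = [("desk", "at_location", "office"), ("bob", "holds", "laptop")]
epsilon = 0.1  # epsilon-greedy exploration, as in standard deep Q-learning

for memory in short_term:
    q_values = qnet(featurize(memory))
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = ACTIONS[int(torch.argmax(q_values))]
    print(memory, "->", action)
```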

Disentangled (Un)Controllable Features

Oct 31, 2022
Jacob E. Kooi, Mark Hoogendoorn, Vincent François-Lavet

In the context of MDPs with high-dimensional states, reinforcement learning can achieve better results when using a compressed, low-dimensional representation of the original input space. A variety of learning objectives have therefore been used to learn useful representations, but the individual features of these representations usually lack interpretability. We propose a representation learning algorithm that disentangles the latent features into a controllable and an uncontrollable part. The resulting representations are easily interpretable and can be used for efficient learning and planning by leveraging the specific properties of the two parts. To highlight the benefits of the approach, the disentangling properties of the algorithm are illustrated in three different environments.
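
As a rough illustration of the split described above (assumptions throughout, not the paper's implementation), the sketch below encodes an observation into two latent parts and trains the controllable part with an action-conditioned forward model and the uncontrollable part with an action-independent one. The actual algorithm uses additional losses to enforce the disentanglement; this only shows the basic structure.

```python
# Illustrative split of a latent state into controllable (z_c) and
# uncontrollable (z_u) parts; names and sizes are hypothetical.
import torch
import torch.nn as nn

class SplitEncoder(nn.Module):
    def __init__(self, obs_dim=16, c_dim=4, u_dim=4):
        super().__init__()
        self.c_dim = c_dim
        self.enc = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, c_dim + u_dim))

    def forward(self, obs):
        z = self.enc(obs)
        return z[..., :self.c_dim], z[..., self.c_dim:]

obs_dim, act_dim, c_dim, u_dim = 16, 2, 4, 4
encoder = SplitEncoder(obs_dim, c_dim, u_dim)
trans_c = nn.Linear(c_dim + act_dim, c_dim)  # action-conditioned forward model
trans_u = nn.Linear(u_dim, u_dim)            # action-independent forward model

obs = torch.randn(32, obs_dim)
action = torch.randn(32, act_dim)
next_obs = torch.randn(32, obs_dim)

z_c, z_u = encoder(obs)
z_c_next, z_u_next = encoder(next_obs)

# Controllable features should be predictable given the action,
# uncontrollable features should be predictable without it.
loss = (nn.functional.mse_loss(trans_c(torch.cat([z_c, action], dim=-1)), z_c_next)
        + nn.functional.mse_loss(trans_u(z_u), z_u_next))
loss.backward()
print("toy disentangling loss:", float(loss))
```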

* 14 pages (9 main paper pages), 9 figures 

A Meta-Reinforcement Learning Algorithm for Causal Discovery

Jul 18, 2022
Andreas Sauter, Erman Acar, Vincent François-Lavet

Causal discovery is of major importance for machine learning, since causal structures enable models to go beyond purely correlation-based inference and can significantly boost their performance. However, finding causal structures from data is a significant challenge in terms of both computational effort and accuracy, and is in general impossible without interventions. In this paper, we develop a meta-reinforcement learning algorithm that performs causal discovery by learning to perform interventions so that it can construct an explicit causal graph. Apart from being useful for downstream applications, the estimated causal graph also provides an explanation of the data-generating process. We show that our algorithm estimates graphs that compare favorably with state-of-the-art approaches, even in environments whose underlying causal structure is previously unseen. Further, we conduct an ablation study that shows how learning where to intervene contributes to the overall performance of our approach. We conclude that interventions indeed help boost performance, efficiently yielding an accurate estimate of the causal structure of a possibly unseen environment.
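
To make concrete why interventions help, here is a toy sketch (not the paper's meta-RL algorithm): intervening on one variable at a time and checking which other variables shift in mean reveals ancestral relations in a small linear structural causal model. In the paper, a learned meta-RL policy decides where to intervene; a fixed round-robin schedule stands in for it here.

```python
# Toy illustration of intervention-based causal discovery (all values hypothetical).
import numpy as np

rng = np.random.default_rng(0)

def sample(n, do=None):
    # Ground-truth linear SCM: X0 -> X1 -> X2 (unknown to the "agent").
    x0 = rng.normal(size=n) if do != 0 else np.full(n, 5.0)
    x1 = 2.0 * x0 + rng.normal(size=n) if do != 1 else np.full(n, 5.0)
    x2 = -1.5 * x1 + rng.normal(size=n) if do != 2 else np.full(n, 5.0)
    return np.stack([x0, x1, x2], axis=1)

baseline = sample(5000).mean(axis=0)
edges = set()
for target in range(3):                  # round-robin interventions
    shifted = sample(5000, do=target).mean(axis=0)
    for j in range(3):
        if j != target and abs(shifted[j] - baseline[j]) > 0.5:
            edges.add((target, j))       # target is an ancestor of j
print("estimated ancestral edges:", sorted(edges))
```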

* Accepted submission to CRL@UAI 22 

Domain Adversarial Reinforcement Learning

Feb 14, 2021
Bonnie Li, Vincent François-Lavet, Thang Doan, Joelle Pineau

We consider the problem of generalization in reinforcement learning where visual aspects of the observations may differ, e.g., different backgrounds or changes in contrast, brightness, etc. We assume that the agent has access to only a few of the MDPs from the MDP distribution during training, and its performance is then reported on new, unknown test domains drawn from the distribution (e.g., unseen backgrounds). For this "zero-shot RL" task, we enforce invariance of the learned representations to visual domains via a domain-adversarial optimization process. We empirically show that this approach yields a significant generalization improvement on new, unseen domains.
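
The domain-adversarial part of this idea can be sketched with a gradient-reversal layer and a domain classifier on top of the shared encoder, as in standard domain-adversarial training; the sizes and module names below are illustrative assumptions, not the paper's architecture.

```python
# Sketch: a domain classifier tries to guess the visual domain of a feature,
# and a gradient-reversal layer pushes the encoder toward domain-invariant features.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU())        # shared RL encoder
policy_head = nn.Linear(32, 4)                                # e.g. 4 discrete actions
domain_head = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 3))  # 3 training domains

obs = torch.randn(8, 64)
domain_labels = torch.randint(0, 3, (8,))
features = encoder(obs)

policy_logits = policy_head(features)                          # used by the RL loss (omitted)
domain_logits = domain_head(GradReverse.apply(features, 1.0))  # reversed gradient hits the encoder
adv_loss = nn.functional.cross_entropy(domain_logits, domain_labels)
adv_loss.backward()
```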

Novelty Search in Representational Space for Sample Efficient Exploration

Oct 21, 2020
Ruo Yu Tao, Vincent François-Lavet, Joelle Pineau

We present a new approach for efficient exploration that leverages a low-dimensional encoding of the environment learned with a combination of model-based and model-free objectives. Our approach uses intrinsic rewards based on the distance to nearest neighbors in this low-dimensional representational space to gauge novelty. We then leverage these intrinsic rewards for sample-efficient exploration with planning routines in representational space on hard-exploration tasks with sparse rewards. A key element of our approach is the use of information-theoretic principles to shape the representations such that the novelty reward goes beyond pixel similarity. We test our approach on a number of maze tasks, as well as a control problem, and show that it is more sample-efficient than strong baselines.
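
A minimal sketch of the intrinsic reward described above (with a random projection standing in for the learned low-dimensional encoder, so everything here is an assumption): novelty is the average distance to the k nearest neighbors among previously visited encodings.

```python
# Hypothetical novelty reward via k-nearest-neighbor distances in a low-dim space.
import numpy as np

rng = np.random.default_rng(0)
projection = rng.normal(size=(784, 8))           # stand-in for the learned encoder

def encode(obs: np.ndarray) -> np.ndarray:
    return obs @ projection

def novelty_reward(z: np.ndarray, visited: np.ndarray, k: int = 5) -> float:
    # Average Euclidean distance to the k nearest previously visited encodings.
    dists = np.linalg.norm(visited - z, axis=1)
    return float(np.sort(dists)[:k].mean())

visited = encode(rng.normal(size=(200, 784)))    # buffer of past encodings
new_obs = rng.normal(size=784)
print("intrinsic reward:", novelty_reward(encode(new_obs), visited))
```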

* 10 pages + references + appendix. Oral presentation at NeurIPS 2020 

Neural Architecture Search for Class-incremental Learning

Sep 14, 2019
Shenyang Huang, Vincent François-Lavet, Guillaume Rabusseau

In class-incremental learning, a model learns continuously from a sequential data stream in which new classes occur. Existing methods often rely on static, manually crafted architectures, which can be prone to capacity saturation because a neural network's ability to generalize to new concepts is limited by its fixed capacity. To understand how to expand a continual learner, we focus on the neural architecture design problem in the context of class-incremental learning: at each time step, the learner must optimize its performance on all classes observed so far by selecting the most competitive neural architecture. To tackle this problem, we propose Continual Neural Architecture Search (CNAS): an AutoML approach that takes advantage of the sequential nature of class-incremental learning to efficiently and adaptively identify strong architectures in a continual learning setting. We employ a task network to perform the classification task and a reinforcement learning agent as the meta-controller for architecture search. In addition, we apply network transformations to transfer weights from the previous learning step and to reduce the size of the architecture search space, saving a large amount of computation. We evaluate CNAS on the CIFAR-100 dataset under varied incremental learning scenarios with limited computational power (1 GPU). Experimental results demonstrate that CNAS outperforms architectures that are optimized for the entire dataset. In addition, CNAS is at least an order of magnitude more efficient than naively applying existing AutoML methods.
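
A rough sketch of the expansion loop (names, action space, and schedule are illustrative assumptions): when new classes arrive, a meta-controller picks an architecture transformation and the task network is grown accordingly. The actual method transfers weights via network transformations and trains the controller with reinforcement learning; both are omitted here for brevity.

```python
# Hypothetical continual architecture-expansion loop for class-incremental learning.
import random
import torch.nn as nn

ACTIONS = ["keep", "widen", "deepen"]

def controller_policy(num_classes_so_far: int) -> str:
    # Placeholder for the RL meta-controller; a random choice stands in here.
    return random.choice(ACTIONS)

def transform(layers: list, action: str) -> list:
    # layers = hidden widths of an MLP task network.
    if action == "widen":
        layers = layers[:-1] + [layers[-1] * 2]
    elif action == "deepen":
        layers = layers + [layers[-1]]
    return layers

def build(layers: list, in_dim: int, out_dim: int) -> nn.Sequential:
    dims, mods = [in_dim] + layers, []
    for a, b in zip(dims, dims[1:]):
        mods += [nn.Linear(a, b), nn.ReLU()]
    return nn.Sequential(*mods, nn.Linear(dims[-1], out_dim))

layers, seen_classes = [64], 0
for step, new_classes in enumerate([10, 10, 10]):   # classes arrive in increments
    seen_classes += new_classes
    layers = transform(layers, controller_policy(seen_classes))
    net = build(layers, in_dim=32 * 32 * 3, out_dim=seen_classes)
    print(f"step {step}: hidden widths {layers}, output classes {seen_classes}")
```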

* 8 pages, 10 figures 

Combined Reinforcement Learning via Abstract Representations

Sep 12, 2018
Vincent François-Lavet, Yoshua Bengio, Doina Precup, Joelle Pineau

In the quest for efficient and robust reinforcement learning methods, both model-free and model-based approaches offer advantages. In this paper we propose a new way of explicitly bridging both approaches via a shared low-dimensional learned encoding of the environment, meant to capture summarizing abstractions. We show that the modularity brought by this approach leads to good generalization while being computationally efficient, with planning happening in a smaller latent state space. In addition, this approach recovers a sufficient low-dimensional representation of the environment, which opens up new strategies for interpretable AI, exploration and transfer learning.
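
A bare-bones sketch of the shared encoding idea (component names and sizes are assumptions, not the paper's architecture): one encoder feeds both a model-free Q-head and a model-based transition/reward model, and planning can then happen entirely in the small latent space, e.g. a one-step lookahead.

```python
# Hypothetical combined model-free / model-based agent over a shared latent space.
import torch
import torch.nn as nn

latent_dim, n_actions = 4, 3
encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, latent_dim))
q_head = nn.Linear(latent_dim, n_actions)                   # model-free branch
transition = nn.Linear(latent_dim + n_actions, latent_dim)  # model-based branch
reward_model = nn.Linear(latent_dim + n_actions, 1)

def one_step_lookahead(obs: torch.Tensor, gamma: float = 0.95) -> int:
    # Plan one step ahead in latent space: r(z, a) + gamma * max_a' Q(z', a').
    z = encoder(obs)
    best_a, best_val = 0, -float("inf")
    for a in range(n_actions):
        a_onehot = nn.functional.one_hot(torch.tensor(a), n_actions).float()
        za = torch.cat([z, a_onehot])
        val = reward_model(za) + gamma * q_head(transition(za)).max()
        if val.item() > best_val:
            best_a, best_val = a, val.item()
    return best_a

print("greedy one-step-lookahead action:", one_step_lookahead(torch.randn(64)))
```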

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies

Jan 20, 2016
Vincent François-Lavet, Raphael Fonteneau, Damien Ernst

Using deep neural networks as function approximators for reinforcement learning tasks has recently been shown to be very powerful for solving problems approaching real-world complexity. Using these results as a benchmark, we discuss the role that the discount factor may play in the quality of the learning process of a deep Q-network (DQN). We empirically show that progressively increasing the discount factor up to its final value makes it possible to significantly reduce the number of learning steps, and that, when used in conjunction with a varying learning rate, this strategy outperforms the original DQN on several experiments. We relate this phenomenon to the instabilities of neural networks when they are used in an approximate dynamic programming setting. We also describe the possibility of falling into a local optimum during the learning process, thus connecting our discussion with the exploration/exploitation dilemma.
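
A minimal sketch of a discount-factor schedule in this spirit (the linear form below is an illustrative assumption, not the schedule used in the paper): gamma starts low and is progressively increased toward its final value during training.

```python
# Hypothetical linear schedule for the discount factor during DQN training.
def discount_schedule(step: int, total_steps: int,
                      gamma_start: float = 0.90, gamma_end: float = 0.99) -> float:
    frac = min(step / total_steps, 1.0)
    return gamma_start + frac * (gamma_end - gamma_start)

for step in range(0, 100001, 25000):
    print(step, round(discount_schedule(step, 100000), 4))
```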

* NIPS 2015 Deep Reinforcement Learning Workshop 

Simple connectome inference from partial correlation statistics in calcium imaging

Nov 18, 2014
Antonio Sutera, Arnaud Joly, Vincent François-Lavet, Zixiao Aaron Qiu, Gilles Louppe, Damien Ernst, Pierre Geurts

In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps: first, processing the raw signals to detect neural peak activities; second, inferring the degree of association between neurons from partial correlation statistics. This paper summarises the methodology that led us to win the Connectomics Challenge, proposes a simplified version of our method, and compares our results with those of other inference methods.
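
The second step can be sketched in a few lines (thresholds, data, and preprocessing are placeholders, not the method's actual pipeline): partial correlations between neurons are obtained by inverting the covariance matrix of the processed activity traces.

```python
# Hypothetical partial-correlation scoring of neuron pairs from activity traces.
import numpy as np

rng = np.random.default_rng(0)
signals = rng.normal(size=(10000, 5))            # stand-in for peak-filtered traces
signals[:, 1] += 0.8 * signals[:, 0]             # toy coupling: neuron 0 -> neuron 1

precision = np.linalg.inv(np.cov(signals, rowvar=False))
d = np.sqrt(np.diag(precision))
partial_corr = -precision / np.outer(d, d)       # standard partial-correlation formula
np.fill_diagonal(partial_corr, 1.0)

# Score each undirected neuron pair by the magnitude of its partial correlation.
print(np.round(np.abs(partial_corr), 2))
```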
