Laurent Charlin

University of Toronto

Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline

May 28, 2022

Massimo Caccia, Jonas Mueller, Taesup Kim, Laurent Charlin, Rasool Fakoor

Abstract: We study task-agnostic continual reinforcement learning (TACRL), in which standard RL challenges are compounded with partial observability stemming from task agnosticism, as well as additional difficulties of continual learning (CL), i.e., learning on a non-stationary sequence of tasks. Here we compare TACRL methods with their soft upper bounds prescribed by previous literature: multi-task learning (MTL) methods, which do not have to deal with non-stationary data distributions, as well as task-aware methods, which are allowed to operate under full observability. We consider a previously unexplored and straightforward baseline for TACRL, replay-based recurrent RL (3RL), in which we augment an RL algorithm with recurrent mechanisms to address partial observability and experience replay mechanisms to address catastrophic forgetting in CL. Studying empirical performance in a sequence of RL tasks, we find surprising occurrences of 3RL matching and surpassing the MTL and task-aware soft upper bounds. We lay out hypotheses that could explain this inflection point of continual and task-agnostic learning research. Our hypotheses are empirically tested in continuous control tasks via a large-scale study of the popular multi-task and continual learning benchmark Meta-World. By analyzing different training statistics including gradient conflict, we find evidence that 3RL's outperformance stems from its ability to quickly infer how new tasks relate to the previous ones, enabling forward transfer.
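
The gradient-conflict statistic mentioned in this abstract can be illustrated with a minimal sketch. It assumes, as one common convention and not necessarily the paper's definition, that two task gradients conflict when their cosine similarity is negative. The function names and the toy gradient vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        # Assumption: a zero gradient is neither aligned nor conflicting.
        return 0.0
    return dot / (norm1 * norm2)


def gradients_conflict(g1: Sequence[float], g2: Sequence[float]) -> bool:
    """Assumed criterion: conflict means the gradients point in opposing directions."""
    return cosine_similarity(g1, g2) < 0.0


# Toy gradients for three hypothetical tasks (illustrative values only).
grad_task_a = [1.0, 0.5, -0.2]
grad_task_b = [-0.8, -0.4, 0.3]
grad_task_c = [0.9, 0.6, 0.0]

print(gradients_conflict(grad_task_a, grad_task_b))
print(gradients_conflict(grad_task_a, grad_task_c))
```

Under this convention, fewer conflicting gradient pairs across tasks would be consistent with tasks sharing useful update directions.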

Foundational Models for Continual Learning: An Empirical Study of Latent Replay

Apr 30, 2022

Oleksiy Ostapenko, Timothee Lesort, Pau Rodríguez, Md Rifat Arefin, Arthur Douillard, Irina Rish, Laurent Charlin

Abstract: Rapid development of large-scale pre-training has resulted in foundation models that can act as effective feature extractors on a variety of downstream tasks and domains. Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios. Our goal is twofold. First, we want to understand the compute-accuracy trade-off between CL in the raw-data space and in the latent space of pre-trained encoders. Second, we investigate how the characteristics of the encoder, the pre-training algorithm and data, as well as of the resulting latent space affect CL performance. For this, we compare the efficacy of various pre-trained models in large-scale benchmarking scenarios with a vanilla replay setting applied in the latent and in the raw-data space. Notably, this study shows how transfer, forgetting, task similarity and learning are dependent on the input data characteristics and not necessarily on the CL algorithms. First, we show that under some circumstances reasonable CL performance can readily be achieved with a non-parametric classifier at negligible compute. We then show how models pre-trained on broader data result in better performance for various replay sizes. We explain this with representational similarity and transfer properties of these representations. Finally, we show the effectiveness of self-supervised pre-training for downstream domains that are out-of-distribution as compared to the pre-training domain. We point out and validate several research directions that can further increase the efficacy of latent CL including representation ensembling. The diverse set of datasets used in this study can serve as a compute-efficient playground for further CL research. The codebase is available at https://github.com/oleksost/latent_CL.
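
A non-parametric classifier on top of frozen latent features can be sketched as follows. This abstract does not name the classifier, so the nearest-class-mean rule used here is an assumption chosen for illustration. The feature vectors stand in for the outputs of a hypothetical pre-trained encoder.

```python
from typing import Dict, List, Sequence


class NearestClassMean:
    """Assumed example of a non-parametric classifier over latent features."""

    def __init__(self) -> None:
        self.sums: Dict[int, List[float]] = {}
        self.counts: Dict[int, int] = {}

    def update(self, feature: Sequence[float], label: int) -> None:
        """Accumulate a per-class running sum; no gradient steps are needed."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(feature)
            self.counts[label] = 0
        self.sums[label] = [s + x for s, x in zip(self.sums[label], feature)]
        self.counts[label] += 1

    def predict(self, feature: Sequence[float]) -> int:
        """Return the label whose class mean is closest in squared Euclidean distance."""
        def squared_distance(label: int) -> float:
            n = self.counts[label]
            return sum((x - s / n) ** 2 for x, s in zip(feature, self.sums[label]))

        return min(self.sums, key=squared_distance)


clf = NearestClassMean()

# Hypothetical task 1: classes 0 and 1.
for feature, label in [([0.0, 0.0], 0), ([0.2, 0.0], 0), ([1.0, 1.0], 1), ([1.2, 1.0], 1)]:
    clf.update(feature, label)

# Hypothetical task 2 arrives later and adds class 2.
clf.update([5.0, 5.0], 2)

print(clf.predict([0.1, 0.1]), clf.predict([1.1, 0.9]), clf.predict([4.8, 5.1]))
```

Because adding a class only creates a new running mean, the means of earlier classes are left unchanged when later tasks arrive, which is one reason such a classifier is cheap in a continual setting.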

The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

Mar 17, 2022

Maxime Gasse, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Chételat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M. Kazachkov, Elias Khalil(+31 more)

Abstract: Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning as a new approach for solving combinatorial problems, either directly as solvers or by enhancing exact solvers. Based on this context, the ML4CO competition aims at improving state-of-the-art combinatorial optimization solvers by replacing key heuristic components. The competition featured three challenging tasks: finding the best feasible solution, producing the tightest optimality certificate, and giving an appropriate solver configuration. Three realistic datasets were considered: balanced item placement, workload apportionment, and maritime inventory routing. This last dataset was kept anonymous to the contestants.

* NeurIPS 2021 competition. arXiv admin note: text overlap with arXiv:2112.12251 by other authors

A New Era: Intelligent Tutoring Systems Will Transform Online Learning for Millions

Mar 03, 2022

Francois St-Hilaire, Dung Do Vu, Antoine Frau, Nathan Burns, Farid Faraji, Joseph Potochny, Stephane Robert, Arnaud Roussel, Selene Zheng, Taylor Glazier(+21 more)

Abstract: Despite artificial intelligence (AI) having transformed major aspects of our society, only a fraction of its potential has been explored, let alone deployed, for education. AI-powered learning can provide millions of learners with a highly personalized, active and practical learning experience, which is key to successful learning. This is especially relevant in the context of online learning platforms. In this paper, we present the results of a comparative head-to-head study on learning outcomes for two popular online learning platforms (n=199 participants): a MOOC platform following a traditional model delivering content using lecture videos and multiple-choice quizzes, and the Korbit learning platform providing a highly personalized, active and practical learning experience. We observe a large and statistically significant increase in learning outcomes: students on the Korbit platform with full feedback have higher course completion rates and achieve learning gains 2 to 2.5 times higher than both students on the MOOC platform and students in a control group who do not receive personalized feedback on the Korbit platform. The results demonstrate the tremendous impact that can be achieved with a personalized, active learning AI-powered system. Making this technology and learning experience available to millions of learners around the world will represent a significant leap forward towards the democratization of education.

* 9 pages, 6 figures

Continual Learning via Local Module Composition

Nov 15, 2021

Oleksiy Ostapenko, Pau Rodriguez, Massimo Caccia, Laurent Charlin

Abstract: Modularity is a compelling solution to continual learning (CL), the problem of modeling sequences of related tasks. Learning and then composing modules to solve different tasks provides an abstraction to address the principal challenges of CL including catastrophic forgetting, backward and forward transfer across tasks, and sub-linear model growth. We introduce local module composition (LMC), an approach to modular CL where each module is provided a local structural component that estimates a module's relevance to the input. Dynamic module composition is performed layer-wise based on local relevance scores. We demonstrate that agnosticity to task identities (IDs) arises from (local) structural learning that is module-specific as opposed to task- and/or model-specific, as in previous works, making LMC applicable to more CL settings than previous works. In addition, LMC also tracks statistics about the input distribution and adds new modules when outlier samples are detected. In the first set of experiments, LMC performs favorably compared to existing methods on the recent Continual Transfer-learning Benchmark without requiring task identities. In another study, we show that the locality of structural learning allows LMC to interpolate to related but unseen tasks (OOD), as well as to compose modular networks trained independently on different task sequences into a third modular network without any fine-tuning. Finally, in search of limitations of LMC, we study it on more challenging sequences of 30 and 100 tasks, demonstrating that local module selection becomes much more challenging in the presence of a large number of candidate modules. In this setting, the best-performing LMC spawns far fewer modules than an oracle-based baseline; however, it reaches a lower overall accuracy. The codebase is available at https://github.com/oleksost/LMC.
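
The idea of routing by local relevance scores and adding a module when an input looks like an outlier can be illustrated with a toy sketch. Everything here is a stand-in assumption and not LMC's implementation: inputs are scalars, relevance is the negative distance to a module's running input mean, and the outlier threshold is a hypothetical fixed constant. The class names `ToyModule` and `ToyLayer` are invented for illustration.

```python
from typing import List


class ToyModule:
    """Hypothetical module that tracks the running mean of the inputs routed to it."""

    def __init__(self, first_input: float) -> None:
        self.count = 1
        self.mean = first_input

    def relevance(self, x: float) -> float:
        """Stand-in local relevance score: higher means x is closer to past inputs."""
        return -abs(x - self.mean)

    def observe(self, x: float) -> None:
        """Incrementally update the running mean with a newly routed input."""
        self.count += 1
        self.mean += (x - self.mean) / self.count


class ToyLayer:
    """One layer that routes each input to its most relevant module."""

    def __init__(self, outlier_threshold: float = 2.0) -> None:
        self.modules: List[ToyModule] = []
        self.outlier_threshold = outlier_threshold  # assumed fixed constant

    def route(self, x: float) -> int:
        """Return the index of the chosen module, spawning one for outliers."""
        if self.modules:
            scores = [m.relevance(x) for m in self.modules]
            best = max(range(len(scores)), key=scores.__getitem__)
            if -scores[best] <= self.outlier_threshold:
                self.modules[best].observe(x)
                return best
        # No module considers x relevant enough: add a new module for it.
        self.modules.append(ToyModule(x))
        return len(self.modules) - 1


layer = ToyLayer()
assignments = [layer.route(x) for x in [0.0, 0.4, 0.2, 10.0, 10.3, 0.1]]
print(assignments, len(layer.modules))
```

In this toy run the inputs near 0 share one module and the inputs near 10 trigger a second one, without any task identity being supplied. A full network would stack several such layers and make the selection at each of them.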

* NeurIPS 2021

Sequoia: A Software Framework to Unify Continual Learning Research

Aug 03, 2021

Fabrice Normandin, Florian Golemo, Oleksiy Ostapenko, Pau Rodriguez, Matthew D Riemer, Julio Hurtado, Khimya Khetarpal, Dominic Zhao, Ryan Lindeborg, Timothée Lesort(+3 more)

Abstract: The field of Continual Learning (CL) seeks to develop algorithms that accumulate knowledge and skills over time through interaction with non-stationary environments and data distributions. Measuring progress in CL can be difficult because a plethora of evaluation procedures (settings) and algorithmic solutions (methods) have emerged, each with their own potentially disjoint set of assumptions about the CL problem. In this work, we view each setting as a set of assumptions. We then create a tree-shaped hierarchy of the research settings in CL, in which more general settings become the parents of those with more restrictive assumptions. This makes it possible to use inheritance to share and reuse research, as developing a method for a given setting also makes it directly applicable to any of its children. We instantiate this idea as a publicly available software framework called Sequoia, which features a variety of settings from both the Continual Supervised Learning (CSL) and Continual Reinforcement Learning (CRL) domains. Sequoia also includes a growing suite of methods which are easy to extend and customize, in addition to more specialized methods from third-party libraries. We hope that this new paradigm and its first implementation can serve as a foundation for the unification and acceleration of research in CL. You can help us grow the tree by visiting www.github.com/lebrice/Sequoia.
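
The inheritance idea in this abstract can be sketched with plain Python classes. The class names and the `is_applicable` helper below are hypothetical and are not presented as Sequoia's actual API. The sketch only shows how a method written for a general setting becomes applicable to every more restrictive child setting.

```python
from typing import Type


class ToySetting:
    """Hypothetical root setting: makes the fewest assumptions."""


class ToyContinualSetting(ToySetting):
    """Child setting: assumes data arrives as a non-stationary stream."""


class ToyTaskIncrementalSetting(ToyContinualSetting):
    """Grandchild setting: additionally assumes task labels are available."""


class ToyMethod:
    """Hypothetical method that declares the most general setting it supports."""

    target_setting: Type[ToySetting] = ToySetting

    @classmethod
    def is_applicable(cls, setting: Type[ToySetting]) -> bool:
        """A method applies to its target setting and to all of its descendants."""
        return issubclass(setting, cls.target_setting)


class ToyReplayMethod(ToyMethod):
    """Written once for the general continual setting."""

    target_setting = ToyContinualSetting


class ToyTaskLabelMethod(ToyMethod):
    """Relies on task labels, so it targets the more restrictive setting."""

    target_setting = ToyTaskIncrementalSetting


print(ToyReplayMethod.is_applicable(ToyTaskIncrementalSetting))
print(ToyTaskLabelMethod.is_applicable(ToyContinualSetting))
```

The asymmetry is the point of the design: a method that needs fewer assumptions runs on settings that provide more, but not the other way around.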

Pretraining Representations for Data-Efficient Reinforcement Learning

Jun 09, 2021

Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville

Abstract: Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly surpasses prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data, approaching human-level performance and data-efficiency on Atari in our best setting. We provide code associated with this work at https://github.com/mila-iqia/SGI.

Comparative Study of Learning Outcomes for Online Learning Platforms

Apr 15, 2021

Francois St-Hilaire, Nathan Burns, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Dung Do Vu, Antoine Frau, Joseph Potochny, Farid Faraji, Vincent Pavero(+9 more)

Abstract: Personalization and active learning are key aspects of successful learning. These aspects are important to address in intelligent educational applications, as they help systems to adapt and close the gap between students with varying abilities, which becomes increasingly important in the context of online and distance learning. We run a comparative head-to-head study of learning outcomes for two popular online learning platforms: Platform A, which follows a traditional model delivering content over a series of lecture videos and multiple-choice quizzes, and Platform B, which creates a personalized learning environment and provides problem-solving exercises and personalized feedback. We report on the results of our study using pre- and post-assessment quizzes with participants taking courses on an introductory data science topic on two platforms. We observe a statistically significant increase in the learning outcomes on Platform B, highlighting the impact of well-designed and well-engineered technology supporting active learning and problem-based learning in online education. Moreover, the results of the self-assessment questionnaire, where participants reported on perceived learning gains, suggest that participants using Platform B improve their metacognition.

* 14 pages, 3 figures, 2 tables, accepted at AIED 2021 (2021 Conference on Artificial Intelligence in Education)

Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations

Mar 18, 2021

Pau Rodriguez, Massimo Caccia, Alexandre Lacoste, Lee Zamparo, Issam Laradji, Laurent Charlin, David Vazquez

Abstract: Explainability for machine learning models has gained considerable attention within our research community given the importance of deploying more reliable machine-learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction, providing details about the model's decision-making. Current counterfactual methods make ambiguous interpretations as they combine multiple biases of the model and the data in a single counterfactual interpretation of the model's decision. Moreover, these methods tend to generate trivial counterfactuals about the model's decision, as they often suggest exaggerating or removing the presence of the attribute being classified. For the machine learning practitioner, these types of counterfactuals offer little value, since they provide no new information about undesired model or data biases. In this work, we propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss to uncover multiple valuable explanations about the model's prediction. Further, we introduce a mechanism to prevent the model from producing trivial explanations. Experiments on CelebA and Synbols demonstrate that our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods. We will publish the code.

Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles

Oct 27, 2020

Yao Lu, Yue Dong, Laurent Charlin

Abstract: Multi-document summarization is a challenging task for which there exist few large-scale datasets. We propose Multi-XScience, a large-scale multi-document summarization dataset created from scientific articles. Multi-XScience introduces a challenging multi-document summarization task: writing the related-work section of a paper based on its abstract and the articles it references. Our work is inspired by extreme summarization, a dataset construction protocol that favours abstractive modeling approaches. Descriptive statistics and empirical results (using several state-of-the-art models trained on the Multi-XScience dataset) reveal that Multi-XScience is well suited for abstractive models.

* EMNLP 2020
