Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yoshua Bengio

DIRO

A cry for help: Early detection of brain injury in newborns

Oct 13, 2023

Charles C. Onu, Samantha Latremouille, Arsenii Gorin, Junhao Wang, Uchenna Ekwochi, Peter O. Ubuane, Omolara A. Kehinde, Muhammad A. Salisu, Datonye Briggs, Yoshua Bengio(+1 more)

Figure 1 for A cry for help: Early detection of brain injury in newborns

Figure 2 for A cry for help: Early detection of brain injury in newborns

Figure 3 for A cry for help: Early detection of brain injury in newborns

Figure 4 for A cry for help: Early detection of brain injury in newborns

Abstract:Since the 1960s, neonatal clinicians have known that newborns suffering from certain neurological conditions exhibit altered crying patterns such as the high-pitched cry in birth asphyxia. Despite an annual burden of over 1.5 million infant deaths and disabilities, early detection of neonatal brain injuries due to asphyxia remains a challenge, particularly in developing countries where the majority of births are not attended by a trained physician. Here we report on the first inter-continental clinical study to demonstrate that neonatal brain injury can be reliably determined from recorded infant cries using an AI algorithm we call Roseline. Previous and recent work has been limited by the lack of a large, high-quality clinical database of cry recordings, constraining the application of state-of-the-art machine learning. We develop a new training methodology for audio-based pathology detection models and evaluate this system on a large database of newborn cry sounds acquired from geographically diverse settings -- 5 hospitals across 3 continents. Our system extracts interpretable acoustic biomarkers that support clinical decisions and is able to accurately detect neurological injury from newborns' cries with an AUC of 92.5% (88.7% sensitivity at 80% specificity). Cry-based neurological monitoring opens the door for low-cost, easy-to-use, non-invasive and contact-free screening of at-risk babies, especially when integrated into simple devices like smartphones or neonatal ICU monitors. This would provide a reliable tool where there are no alternatives, but also curtail the need to regularly exert newborns to physically-exhausting or radiation-exposing assessments such as brain CT scans. This work sets the stage for embracing the infant cry as a vital sign and indicates the potential of AI-driven sound monitoring for the future of affordable healthcare.

Via

Access Paper or Ask Questions

PhyloGFN: Phylogenetic inference with generative flow networks

Oct 12, 2023

Mingyang Zhou, Zichao Yan, Elliot Layne, Nikolay Malkin, Dinghuai Zhang, Moksh Jain, Mathieu Blanchette, Yoshua Bengio

Abstract:Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities. Its long history and numerous applications notwithstanding, inference of phylogenetic trees from sequence data remains challenging: the high complexity of tree space poses a significant obstacle for the current combinatorial and probabilistic techniques. In this paper, we adopt the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference. Because GFlowNets are well-suited for sampling complex combinatorial structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies and evolutionary distances. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets. PhyloGFN is competitive with prior works in marginal likelihood estimation and achieves a closer fit to the target distribution than state-of-the-art variational inference methods.

Via

Access Paper or Ask Questions

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

Oct 10, 2023

Alvaro Carbonero, Alexandre Duval, Victor Schmidt, Santiago Miret, Alex Hernandez-Garcia, Yoshua Bengio, David Rolnick

Abstract:The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms. However, in practice not all this information may be readily available, e.g.~when evaluating the potentially unknown binding of adsorbates to catalyst. In this paper, we investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate with respect to the electro-catalyst. We consider SchNet, DimeNet++ and FAENet as base architectures and measure the impact of four modifications on model performance: removing edges in the input graph, pooling independent representations, not sharing the backbone weights and using an attention mechanism to propagate non-geometric relative information. We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE. Our work suggests future research directions in accelerated materials discovery where information on reactant configurations can be reduced or altogether omitted.

Via

Access Paper or Ask Questions

Crystal-GFN: sampling crystals with desirable properties and constraints

Oct 07, 2023

Mila AI4Science, Alex Hernandez-Garcia, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Michał Koziarski, Victor Schmidt

Abstract:Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state crystals such as electrocatalysts, ionic conductors or photovoltaics can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFlowNet, a generative model of crystal structures that sequentially samples a crystal's composition, space group and lattice parameters. This domain-inspired approach enables the flexible incorporation of physical and geometrical constraints, as well as the use of any available predictive model of a desired property as an objective function. We evaluate the capabilities of Crystal-GFlowNet by using as objective the formation energy of a crystal structure, as predicted by a new proxy model trained on MatBench. The results demonstrate that Crystal-GFlowNet is able to sample diverse crystals with low formation energy.

* Main paper (9 pages) + references + appendix

Via

Access Paper or Ask Questions

Amortizing intractable inference in large language models

Oct 06, 2023

Edward J. Hu, Moksh Jain, Eric Elmoznino, Younesse Kaddar, Guillaume Lajoie, Yoshua Bengio, Nikolay Malkin

Figure 1 for Amortizing intractable inference in large language models

Figure 2 for Amortizing intractable inference in large language models

Figure 3 for Amortizing intractable inference in large language models

Figure 4 for Amortizing intractable inference in large language models

Abstract:Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end autoregressive sampling. However, many tasks of interest -- including sequence continuation, infilling, and other forms of constrained generation -- involve sampling from intractable posterior distributions. We address this limitation by using amortized Bayesian inference to sample from these intractable posteriors. Such amortization is algorithmically achieved by fine-tuning LLMs via diversity-seeking reinforcement learning algorithms: generative flow networks (GFlowNets). We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training and reward-maximizing policy optimization. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem and demonstrate that our approach enables data-efficient adaptation of LLMs to tasks that require multi-step rationalization and tool use.

* 23 pages; code: https://github.com/GFNOrg/gfn-lm-tuning

Via

Access Paper or Ask Questions

Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

Oct 05, 2023

Trang Nguyen, Alexander Tong, Kanika Madan, Yoshua Bengio, Dianbo Liu

Figure 1 for Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

Figure 2 for Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

Figure 3 for Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

Abstract:Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes. However, causal discovery in GRNs is a challenging problem for multiple reasons including the existence of cyclic feedback loops and uncertainty that yields diverse possible causal structures. Previous works in this area either ignore cyclic dynamics (assume acyclic structure) or struggle with scalability. We introduce Swift-DynGFN as a novel framework that enhances causal structure learning in GRNs while addressing scalability concerns. Specifically, Swift-DynGFN exploits gene-wise independence to boost parallelization and to lower computational cost. Experiments on real single-cell RNA velocity and synthetic GRN datasets showcase the advancement in learning causal structure in GRNs and scalability in larger systems.

Via

Access Paper or Ask Questions

Pre-Training and Fine-Tuning Generative Flow Networks

Oct 05, 2023

Ling Pan, Moksh Jain, Kanika Madan, Yoshua Bengio

Figure 1 for Pre-Training and Fine-Tuning Generative Flow Networks

Figure 2 for Pre-Training and Fine-Tuning Generative Flow Networks

Figure 3 for Pre-Training and Fine-Tuning Generative Flow Networks

Figure 4 for Pre-Training and Fine-Tuning Generative Flow Networks

Abstract:Generative Flow Networks (GFlowNets) are amortized samplers that learn stochastic policies to sequentially generate compositional objects from a given unnormalized reward distribution. They can generate diverse sets of high-reward objects, which is an important consideration in scientific discovery tasks. However, as they are typically trained from a given extrinsic reward function, it remains an important open challenge about how to leverage the power of pre-training and train GFlowNets in an unsupervised fashion for efficient adaptation to downstream tasks. Inspired by recent successes of unsupervised pre-training in various domains, we introduce a novel approach for reward-free pre-training of GFlowNets. By framing the training as a self-supervised problem, we propose an outcome-conditioned GFlowNet (OC-GFN) that learns to explore the candidate space. Specifically, OC-GFN learns to reach any targeted outcomes, akin to goal-conditioned policies in reinforcement learning. We show that the pre-trained OC-GFN model can allow for a direct extraction of a policy capable of sampling from any new reward functions in downstream tasks. Nonetheless, adapting OC-GFN on a downstream task-specific reward involves an intractable marginalization over possible outcomes. We propose a novel way to approximate this marginalization by learning an amortized predictor enabling efficient fine-tuning. Extensive experimental results validate the efficacy of our approach, demonstrating the effectiveness of pre-training the OC-GFN, and its ability to swiftly adapt to downstream tasks and discover modes more efficiently. This work may serve as a foundation for further exploration of pre-training strategies in the context of GFlowNets.

Via

Access Paper or Ask Questions

Expected flow networks in stochastic environments and two-player zero-sum games

Oct 04, 2023

Marco Jiralerspong, Bilun Sun, Danilo Vucetic, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin

Figure 1 for Expected flow networks in stochastic environments and two-player zero-sum games

Figure 2 for Expected flow networks in stochastic environments and two-player zero-sum games

Figure 3 for Expected flow networks in stochastic environments and two-player zero-sum games

Figure 4 for Expected flow networks in stochastic environments and two-player zero-sum games

Abstract:Generative flow networks (GFlowNets) are sequential sampling models trained to match a given distribution. GFlowNets have been successfully applied to various structured object generation tasks, sampling a diverse set of high-reward objects quickly. We propose expected flow networks (EFlowNets), which extend GFlowNets to stochastic environments. We show that EFlowNets outperform other GFlowNet formulations in stochastic tasks such as protein design. We then extend the concept of EFlowNets to adversarial environments, proposing adversarial flow networks (AFlowNets) for two-player zero-sum games. We show that AFlowNets learn to find above 80% of optimal moves in Connect-4 via self-play and outperform AlphaZero in tournaments.

Via

Access Paper or Ask Questions

Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

Oct 04, 2023

Dinghuai Zhang, Ricky Tian Qi Chen, Cheng-Hao Liu, Aaron Courville, Yoshua Bengio

Figure 1 for Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

Figure 2 for Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

Figure 3 for Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

Figure 4 for Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

Abstract:We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic processes to model approximate samples from these target densities. The main drawback of these approaches is that the training objective requires full trajectories to compute, resulting in sluggish credit assignment issues due to use of entire trajectories and a learning signal present only at the terminal time. In this work, we present Diffusion Generative Flow Samplers (DGFS), a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments, via parameterizing an additional "flow function". Our method takes inspiration from the theory developed for generative flow networks (GFlowNets), allowing us to make use of intermediate learning signals and benefit from off-policy exploration capabilities. Through a variety of challenging experiments, we demonstrate that DGFS results in more accurate estimates of the normalization constant than closely-related prior methods.

Via

Access Paper or Ask Questions

Local Search GFlowNets

Oct 04, 2023

Minsu Kim, Taeyoung Yun, Emmanuel Bengio, Dinghuai Zhang, Yoshua Bengio, Sungsoo Ahn, Jinkyoo Park

Abstract:Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets exhibit a remarkable ability to generate diverse samples, yet occasionally struggle to consistently produce samples with high rewards due to over-exploration on wide sample space. This paper proposes to train GFlowNets with local search which focuses on exploiting high rewarded sample space to resolve this issue. Our main idea is to explore the local neighborhood via destruction and reconstruction guided by backward and forward policies, respectively. This allows biasing the samples toward high-reward solutions, which is not possible for a typical GFlowNet solution generation scheme which uses the forward policy to generate the solution from scratch. Extensive experiments demonstrate a remarkable performance improvement in several biochemical tasks. Source code is available: \url{https://github.com/dbsxodud-11/ls_gfn}.

* 18 pages, 17 figures

Via

Access Paper or Ask Questions