Andreas Bender

Deep Learning for Survival Analysis: A Review

May 24, 2023
Simon Wiegrebe, Philipp Kopper, Raphael Sonabend, Andreas Bender

The influx of deep learning (DL) techniques into the field of survival analysis in recent years, coupled with the increasing availability of high-dimensional omics data and unstructured data like images or text, has led to substantial methodological progress, for instance in learning from such high-dimensional or unstructured data. Numerous modern DL-based survival methods have been developed since the mid-2010s; however, they often address only a small subset of scenarios in the time-to-event data setting - e.g., single-risk right-censored survival tasks - and neglect to incorporate more complex (and common) settings. This is partly due to a lack of exchange between experts in the respective fields. In this work, we provide a comprehensive systematic review of DL-based methods for time-to-event analysis, characterizing them according to both survival- and DL-related attributes. In doing so, we hope to provide a helpful overview to practitioners who are interested in DL techniques applicable to their specific use case as well as to enable researchers from both fields to identify directions for future investigation. We provide a detailed characterization of the methods included in this review as an open-source, interactive table: https://survival-org.github.io/DL4Survival. As this research area is advancing rapidly, we encourage the research community to contribute to keeping the information up to date.

* 24 pages, 6 figures, 2 tables, 1 interactive table 

Conditional Neural Processes for Molecules

Oct 17, 2022
Miguel Garcia-Ortegon, Andreas Bender, Sergio Bacallado

Neural processes (NPs) are models for transfer learning with properties reminiscent of Gaussian Processes (GPs). They are adept at modelling data consisting of few observations of many related functions on the same input space and are trained by minimizing a variational objective, which is computationally much less expensive than the Bayesian updating required by GPs. So far, most studies of NPs have focused on low-dimensional datasets which are not representative of realistic transfer learning tasks. Drug discovery is one application area that is characterized by datasets consisting of many chemical properties or functions which are sparsely observed, yet depend on shared features or representations of the molecular inputs. This paper applies the conditional neural process (CNP) to DOCKSTRING, a dataset of docking scores for benchmarking ML models. CNPs show competitive performance in few-shot learning tasks relative to supervised learning baselines common in QSAR modelling, as well as an alternative model for transfer learning based on pre-training and refining neural network regressors. We present a Bayesian optimization experiment which showcases the probabilistic nature of CNPs and discuss shortcomings of the model in uncertainty quantification.

Heterogeneous Treatment Effect Estimation for Observational Data using Model-based Forests

Oct 06, 2022
Susanne Dandl, Andreas Bender, Torsten Hothorn

The estimation of heterogeneous treatment effects (HTEs) has attracted considerable interest in many disciplines, most prominently in medicine and economics. Contemporary research has so far primarily focused on continuous and binary responses where HTEs are traditionally estimated by a linear model, which allows the estimation of constant or heterogeneous effects even under certain model misspecifications. More complex models for survival, count, or ordinal outcomes require stricter assumptions to reliably estimate the treatment effect. Most importantly, the noncollapsibility issue necessitates the joint estimation of treatment and prognostic effects. Model-based forests allow simultaneous estimation of covariate-dependent treatment and prognostic effects, but only for randomized trials. In this paper, we propose modifications to model-based forests to address the confounding issue in observational data. In particular, we evaluate an orthogonalization strategy originally proposed by Robinson (1988, Econometrica) in the context of model-based forests targeting HTE estimation in generalized linear models and transformation models. We found that this strategy reduces confounding effects in a simulation study with various outcome distributions. We demonstrate the practical aspects of HTE estimation for survival and ordinal outcomes by an assessment of the potentially heterogeneous effect of Riluzole on the progress of Amyotrophic Lateral Sclerosis.
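
The orthogonalization idea mentioned above can be illustrated in its simplest linear form. The following is a minimal sketch, not the model-based forest procedure of the paper: it assumes a single confounder `x`, a continuous treatment `w`, a constant effect `true_tau`, and linear nuisance models, all of which are illustrative choices. Outcome and treatment are each residualized on the confounder, and the outcome residuals are then regressed on the treatment residuals.

```python
def ols_fit(x, y):
    """Fit y = a + b * x by least squares and return (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residuals(x, y):
    """Residuals of y after removing its linear dependence on x."""
    a, b = ols_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Toy observational data (hypothetical): x confounds both treatment and outcome.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
u = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]            # treatment variation not explained by x
w = [0.5 * xi + ui for xi, ui in zip(x, u)]      # treatment depends on the confounder
true_tau = 2.0
y = [true_tau * wi + 3.0 * xi for wi, xi in zip(w, x)]

# Naive estimate: regress y on w and ignore x (biased by confounding).
_, naive_tau = ols_fit(w, y)

# Orthogonalized estimate: residual-on-residual regression.
w_res = residuals(x, w)
y_res = residuals(x, y)
tau_hat = sum(wr * yr for wr, yr in zip(w_res, y_res)) / sum(wr * wr for wr in w_res)
```

In this noise-free toy setting the residual-on-residual estimate recovers `true_tau`, while the naive slope is inflated because `x` drives both `w` and `y`.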


Factorized Structured Regression for Large-Scale Varying Coefficient Models

May 25, 2022
David Rügamer, Andreas Bender, Simon Wiegrebe, Daniel Racek, Bernd Bischl, Christian L. Müller, Clemens Stachl

Recommender Systems (RS) pervade many aspects of our everyday digital life. Proposed to work at scale, state-of-the-art RS allow the modeling of thousands of interactions and facilitate highly individualized recommendations. Conceptually, many RS can be viewed as instances of statistical regression models that incorporate complex feature effects and potentially non-Gaussian outcomes. Such structured regression models, including time-aware varying coefficients models, are, however, limited in their applicability to categorical effects and inclusion of a large number of interactions. Here, we propose Factorized Structured Regression (FaStR) for scalable varying coefficient models. FaStR overcomes limitations of general regression models for large-scale data by combining structured additive regression and factorization approaches in a neural network-based model implementation. This fusion provides a scalable framework for the estimation of statistical models in previously infeasible data settings. Empirical results confirm that the estimation of varying coefficients of our approach is on par with state-of-the-art regression techniques, while scaling notably better and also being competitive with other time-aware RS in terms of prediction performance. We illustrate FaStR's performance and interpretability on a large-scale behavioral study with smartphone user data.

DeepPAMM: Deep Piecewise Exponential Additive Mixed Models for Complex Hazard Structures in Survival Analysis

Feb 12, 2022
Philipp Kopper, Simon Wiegrebe, Bernd Bischl, Andreas Bender, David Rügamer

Survival analysis (SA) is an active field of research that is concerned with time-to-event outcomes and is prevalent in many domains, particularly biomedical applications. Despite its importance, SA remains challenging due to small-scale data sets and complex outcome distributions, concealed by truncation and censoring processes. The piecewise exponential additive mixed model (PAMM) is a model class addressing many of these challenges, yet PAMMs are not applicable in high-dimensional feature settings or in the case of unstructured or multimodal data. We unify existing approaches by proposing DeepPAMM, a versatile deep learning framework that is well-founded from a statistical point of view, yet with enough flexibility for modeling complex hazard structures. Through benchmark experiments and an extended case study, we illustrate that DeepPAMM is competitive with other machine learning approaches with respect to predictive performance while maintaining interpretability.

* 13 pages, 2 figures, This work has been accepted by the 26th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD2022) 

Evaluation of survival distribution predictions with discrimination measures

Dec 09, 2021
Raphael Sonabend, Andreas Bender, Sebastian Vollmer

In this paper we consider how to evaluate survival distribution predictions with measures of discrimination. This is a non-trivial problem as discrimination measures are the most commonly used measures in survival analysis and yet there is no clear method to derive a risk prediction from a distribution prediction. We survey methods proposed in literature and software and consider their respective advantages and disadvantages. Whilst distributions are frequently evaluated by discrimination measures, we find that the method for doing so is rarely described in the literature and often leads to unfair comparisons. We find that the most robust method of reducing a distribution to a risk is to sum over the predicted cumulative hazard. We recommend that machine learning survival analysis software implements clear transformations between distribution and risk predictions in order to allow more transparent and accessible model evaluation.
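
The reduction recommended in this abstract, summing over the predicted cumulative hazard, can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes each prediction is a list of survival probabilities on a shared time grid, uses the standard identity H(t) = -log S(t), and clips probabilities at a small `eps` (an assumption) to avoid taking the logarithm of zero. The function name and the example curves are hypothetical.

```python
import math


def risk_from_survival(surv_probs, eps=1e-12):
    """Reduce one predicted survival curve to a scalar risk score.

    surv_probs holds S(t_1), ..., S(t_k) on a time grid shared by all subjects.
    The cumulative hazard is H(t) = -log S(t); the risk is the sum of H over the grid.
    """
    return sum(-math.log(max(s, eps)) for s in surv_probs)


# Two hypothetical predicted survival curves on the same three time points.
curves = {
    "low_risk": [0.99, 0.95, 0.90],
    "high_risk": [0.90, 0.60, 0.30],
}
risks = {name: risk_from_survival(s) for name, s in curves.items()}

# Higher score means higher predicted risk, so sort in descending order.
ranking = sorted(risks, key=risks.get, reverse=True)
```

The resulting scores can be passed to any discrimination measure that expects one risk value per subject, provided all curves are evaluated on the same time grid.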


DOCKSTRING: easy molecular docking yields better benchmarks for ligand design

Oct 29, 2021
Miguel García-Ortegón, Gregor N. C. Simm, Austin J. Tripp, José Miguel Hernández-Lobato, Andreas Bender, Sergio Bacallado

The field of machine learning for drug discovery is witnessing an explosion of novel methods. These methods are often benchmarked on simple physicochemical properties such as solubility or general druglikeness, which can be readily computed. However, these properties are poor representatives of objective functions in drug design, mainly because they do not depend on the candidate's interaction with the target. By contrast, molecular docking is a widely successful method in drug discovery to estimate binding affinities. However, docking simulations require a significant amount of domain knowledge to set up correctly, which hampers adoption. To this end, we present DOCKSTRING, a bundle for meaningful and robust comparison of ML models consisting of three components: (1) an open-source Python package for straightforward computation of docking scores; (2) an extensive dataset of docking scores and poses of more than 260K ligands for 58 medically relevant targets; and (3) a set of pharmaceutically relevant benchmark tasks including regression, virtual screening, and de novo design. The Python package implements a robust ligand and target preparation protocol that allows non-experts to obtain meaningful docking scores. Our dataset is the first to include docking poses, as well as the first of its size that is a full matrix, thus facilitating experiments in multiobjective optimization and transfer learning. Overall, our results indicate that docking scores are a more appropriate evaluation objective than simple physicochemical properties, yielding more realistic benchmark tasks and molecular candidates.


A Review of Biomedical Datasets Relating to Drug Discovery: A Knowledge Graph Perspective

Feb 26, 2021
Stephen Bonner, Ian P Barrett, Cheng Ye, Rowan Swiers, Ola Engkvist, Andreas Bender, William Hamilton

Drug discovery and development is an extremely complex process, with high attrition contributing to the costs of delivering new medicines to patients. Recently, various machine learning approaches have been proposed and investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Among these techniques, those using knowledge graphs are proving especially promising across a range of tasks, including drug repurposing, drug toxicity prediction and target gene-disease prioritisation. In such a knowledge graph-based representation of drug discovery domains, crucial elements including genes, diseases and drugs are represented as entities or vertices, whilst relationships or edges between them indicate some level of interaction. For example, an edge between a disease and drug entity might represent a successful clinical trial, or an edge between two drug entities could indicate a potentially harmful interaction. Constructing high-quality and ultimately informative knowledge graphs, however, requires suitable data and information. In this review, we detail publicly available primary data sources containing information suitable for use in constructing various drug discovery focused knowledge graphs. We aim to help guide machine learning and knowledge graph practitioners who are interested in applying new techniques to the drug discovery field, but who may be unfamiliar with the relevant data sources. Overall, we hope this review will help motivate more machine learning researchers to explore combining knowledge graphs and machine learning to help solve key and emerging questions in the drug discovery domain.


Semi-Structured Deep Piecewise Exponential Models

Nov 11, 2020
Philipp Kopper, Sebastian Pölsterl, Christian Wachinger, Bernd Bischl, Andreas Bender, David Rügamer

We propose a versatile framework for survival analysis that combines advanced concepts from statistics with deep learning. The presented framework is based on piecewise exponential models and thereby supports various survival tasks, such as competing risks and multi-state modeling, and further allows for estimation of time-varying effects and time-varying features. To also incorporate multiple data sources and higher-order interaction effects into the model, we embed the model class in a neural network and thereby enable the simultaneous estimation of both inherently interpretable structured regression inputs as well as deep neural network components which can potentially process additional unstructured data sources. A proof of concept is provided by using the framework to predict Alzheimer's disease progression based on tabular and 3D point cloud data and applying it to synthetic data.
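
Piecewise exponential models rest on a data transformation that may be easier to see in code. The sketch below shows the standard idea under illustrative assumptions: follow-up time is split at fixed cut points, each subject contributes one row per interval in which they were at risk, and each row carries an event indicator and a log-exposure offset so that a Poisson-type model can be fitted on the expanded data. The function name `to_ped`, the row fields, and the cut points are hypothetical and are not taken from the paper's software.

```python
import math


def to_ped(time, event, cuts):
    """Expand one subject's (time, event) pair into piecewise exponential rows.

    cuts: increasing interval boundaries starting at 0, e.g. [0, 1, 2, 3].
    Each row records the interval, whether the event occurred in it,
    and the log of the time spent at risk in it (the Poisson offset).
    """
    rows = []
    for start, end in zip(cuts[:-1], cuts[1:]):
        if time <= start:
            break  # subject is no longer at risk in this interval
        exposure = min(time, end) - start
        status = int(event == 1 and time <= end)
        rows.append({
            "interval": (start, end),
            "ped_status": status,
            "offset": math.log(exposure),
        })
    return rows


# A subject with an event at t = 2.5 contributes three rows.
event_rows = to_ped(time=2.5, event=1, cuts=[0.0, 1.0, 2.0, 3.0])

# A subject censored at t = 1.0 contributes a single row with no event.
censored_rows = to_ped(time=1.0, event=0, cuts=[0.0, 1.0, 2.0, 3.0])
```

Because each row is an ordinary regression observation, interval-specific features or effects can be attached to it, which is what makes time-varying effects and time-varying features straightforward to express in this model class.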

* 8 pages, 3 figures 

mlr3proba: Machine Learning Survival Analysis in R

Aug 18, 2020
Raphael Sonabend, Franz J. Király, Andreas Bender, Bernd Bischl, Michel Lang

As machine learning has become increasingly popular over the last few decades, so too has the number of machine learning interfaces for implementing these models grown. However, no consistent interface for evaluation and modelling of survival analysis has emerged despite its vital importance in many fields, including medicine, economics, and engineering. mlr3proba is part of the mlr3 ecosystem of machine learning packages for R and facilitates mlr3's general model tuning and benchmarking by providing a multitude of performance measures and learners for survival analysis with a clean and systematic infrastructure for their evaluation. mlr3proba provides a comprehensive machine learning interface for survival analysis, which allows survival modelling to finally be up to the state of the art.

* Submitted to JMLR 