Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mihaela van der Schaar

Machine Learning with Requirements: a Manifesto

Apr 07, 2023

Eleonora Giunchiglia, Fergus Imrie, Mihaela van der Schaar, Thomas Lukasiewicz

Abstract:In the recent years, machine learning has made great advancements that have been at the root of many breakthroughs in different application domains. However, it is still an open issue how make them applicable to high-stakes or safety-critical application domains, as they can often be brittle and unreliable. In this paper, we argue that requirements definition and satisfaction can go a long way to make machine learning models even more fitting to the real world, especially in critical domains. To this end, we present two problems in which (i) requirements arise naturally, (ii) machine learning models are or can be fruitfully deployed, and (iii) neglecting the requirements can have dramatic consequences. We show how the requirements specification can be fruitfully integrated into the standard machine learning development pipeline, proposing a novel pyramid development process in which requirements definition may impact all the subsequent phases in the pipeline, and viceversa.

Via

Access Paper or Ask Questions

TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization

Mar 09, 2023

Alan Jeffares, Tennison Liu, Jonathan Crabbé, Fergus Imrie, Mihaela van der Schaar

Abstract:Despite their success with unstructured data, deep neural networks are not yet a panacea for structured tabular data. In the tabular domain, their efficiency crucially relies on various forms of regularization to prevent overfitting and provide strong generalization performance. Existing regularization techniques include broad modelling decisions such as choice of architecture, loss functions, and optimization methods. In this work, we introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS), a novel framework for regularization in the tabular setting built on latent unit attributions. The gradient attribution of an activation with respect to a given input feature suggests how the neuron attends to that feature, and is often employed to interpret the predictions of deep networks. In TANGOS, we take a different approach and incorporate neuron attributions directly into training to encourage orthogonalization and specialization of latent attributions in a fully-connected network. Our regularizer encourages neurons to focus on sparse, non-overlapping input features and results in a set of diverse and specialized latent units. In the tabular domain, we demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods. We provide insight into why our regularizer is effective and demonstrate that TANGOS can be applied jointly with existing methods to achieve even greater generalization performance.

* Published at International Conference on Learning Representations (ICLR) 2023

Via

Access Paper or Ask Questions

Causal Deep Learning

Mar 03, 2023

Jeroen Berrevoets, Krzysztof Kacprzyk, Zhaozhi Qian, Mihaela van der Schaar

Abstract:Causality has the potential to truly transform the way we solve a large number of real-world problems. Yet, so far, its potential remains largely unlocked since most work so far requires strict assumptions which do not hold true in practice. To address this challenge and make progress in solving real-world problems, we propose a new way of thinking about causality - we call this causal deep learning. The framework which we propose for causal deep learning spans three dimensions: (1) a structural dimension, which allows incomplete causal knowledge rather than assuming either full or no causal knowledge; (2) a parametric dimension, which encompasses parametric forms which are typically ignored; and finally, (3) a temporal dimension, which explicitly allows for situations which capture exposure times or temporal structure. Together, these dimensions allow us to make progress on a variety of real-world problems by leveraging (sometimes incomplete) causal knowledge and/or combining diverse causal deep learning methods. This new framework also enables researchers to compare systematically across existing works as well as identify promising research areas which can lead to real-world impact.

* arXiv admin note: substantial text overlap with arXiv:2212.00911

Via

Access Paper or Ask Questions

Learning machines for health and beyond

Mar 02, 2023

Mahed Abroshan, Oscar Giles, Sam Greenbury, Jack Roberts, Mihaela van der Schaar, Jannetta S Steyn, Alan Wilson, May Yong

Figure 1 for Learning machines for health and beyond

Figure 2 for Learning machines for health and beyond

Figure 3 for Learning machines for health and beyond

Figure 4 for Learning machines for health and beyond

Abstract:Machine learning techniques are effective for building predictive models because they are good at identifying patterns in large datasets. Development of a model for complex real life problems often stops at the point of publication, proof of concept or when made accessible through some mode of deployment. However, a model in the medical domain risks becoming obsolete as soon as patient demographic changes. The maintenance and monitoring of predictive models post-publication is crucial to guarantee their safe and effective long term use. As machine learning techniques are effectively trained to look for patterns in available datasets, the performance of a model for complex real life problems will not peak and remain fixed at the point of publication or even point of deployment. Rather, data changes over time, and they also changed when models are transported to new places to be used by new demography.

* 12 pages, 3 figures

Via

Access Paper or Ask Questions

T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression

Feb 24, 2023

Yuchao Qin, Mihaela van der Schaar, Changhee Lee

Figure 1 for T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression

Figure 2 for T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression

Figure 3 for T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression

Figure 4 for T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression

Abstract:Clustering time-series data in healthcare is crucial for clinical phenotyping to understand patients' disease progression patterns and to design treatment guidelines tailored to homogeneous patient subgroups. While rich temporal dynamics enable the discovery of potential clusters beyond static correlations, two major challenges remain outstanding: i) discovery of predictive patterns from many potential temporal correlations in the multi-variate time-series data and ii) association of individual temporal patterns to the target label distribution that best characterizes the underlying clinical progression. To address such challenges, we develop a novel temporal clustering method, T-Phenotype, to discover phenotypes of predictive temporal patterns from labeled time-series data. We introduce an efficient representation learning approach in frequency domain that can encode variable-length, irregularly-sampled time-series into a unified representation space, which is then applied to identify various temporal patterns that potentially contribute to the target label using a new notion of path-based similarity. Throughout the experiments on synthetic and real-world datasets, we show that T-Phenotype achieves the best phenotype discovery performance over all the evaluated baselines. We further demonstrate the utility of T-Phenotype by uncovering clinically meaningful patient subgroups characterized by unique temporal patterns.

Via

Access Paper or Ask Questions

Neural Laplace Control for Continuous-time Delayed Systems

Feb 24, 2023

Samuel Holt, Alihan Hüyük, Zhaozhi Qian, Hao Sun, Mihaela van der Schaar

Figure 1 for Neural Laplace Control for Continuous-time Delayed Systems

Figure 2 for Neural Laplace Control for Continuous-time Delayed Systems

Figure 3 for Neural Laplace Control for Continuous-time Delayed Systems

Figure 4 for Neural Laplace Control for Continuous-time Delayed Systems

Abstract:Many real-world offline reinforcement learning (RL) problems involve continuous-time environments with delays. Such environments are characterized by two distinctive features: firstly, the state x(t) is observed at irregular time intervals, and secondly, the current action a(t) only affects the future state x(t + g) with an unknown delay g > 0. A prime example of such an environment is satellite control where the communication link between earth and a satellite causes irregular observations and delays. Existing offline RL algorithms have achieved success in environments with irregularly observed states in time or known delays. However, environments involving both irregular observations in time and unknown delays remains an open and challenging problem. To this end, we propose Neural Laplace Control, a continuous-time model-based offline RL method that combines a Neural Laplace dynamics model with a model predictive control (MPC) planner--and is able to learn from an offline dataset sampled with irregular time intervals from an environment that has a inherent unknown constant delay. We show experimentally on continuous-time delayed environments it is able to achieve near expert policy performance.

Via

Access Paper or Ask Questions

SurvivalGAN: Generating Time-to-Event Data for Survival Analysis

Feb 24, 2023

Alexander Norcliffe, Bogdan Cebere, Fergus Imrie, Pietro Lio, Mihaela van der Schaar

Figure 1 for SurvivalGAN: Generating Time-to-Event Data for Survival Analysis

Figure 2 for SurvivalGAN: Generating Time-to-Event Data for Survival Analysis

Figure 3 for SurvivalGAN: Generating Time-to-Event Data for Survival Analysis

Figure 4 for SurvivalGAN: Generating Time-to-Event Data for Survival Analysis

Abstract:Synthetic data is becoming an increasingly promising technology, and successful applications can improve privacy, fairness, and data democratization. While there are many methods for generating synthetic tabular data, the task remains non-trivial and unexplored for specific scenarios. One such scenario is survival data. Here, the key difficulty is censoring: for some instances, we are not aware of the time of event, or if one even occurred. Imbalances in censoring and time horizons cause generative models to experience three new failure modes specific to survival analysis: (1) generating too few at-risk members; (2) generating too many at-risk members; and (3) censoring too early. We formalize these failure modes and provide three new generative metrics to quantify them. Following this, we propose SurvivalGAN, a generative model that handles survival data firstly by addressing the imbalance in the censoring and event horizons, and secondly by using a dedicated mechanism for approximating time-to-event/censoring. We evaluate this method via extensive experiments on medical datasets. SurvivalGAN outperforms multiple baselines at generating survival data, and in particular addresses the failure modes as measured by the new metrics, in addition to improving downstream performance of survival models trained on the synthetic data.

Via

Access Paper or Ask Questions

Membership Inference Attacks against Synthetic Data through Overfitting Detection

Feb 24, 2023

Boris van Breugel, Hao Sun, Zhaozhi Qian, Mihaela van der Schaar

Abstract:Data is the foundation of most science. Unfortunately, sharing data can be obstructed by the risk of violating data privacy, impeding research in fields like healthcare. Synthetic data is a potential solution. It aims to generate data that has the same distribution as the original data, but that does not disclose information about individuals. Membership Inference Attacks (MIAs) are a common privacy attack, in which the attacker attempts to determine whether a particular real sample was used for training of the model. Previous works that propose MIAs against generative models either display low performance -- giving the false impression that data is highly private -- or need to assume access to internal generative model parameters -- a relatively low-risk scenario, as the data publisher often only releases synthetic data, not the model. In this work we argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution. We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model. Experimentally we show that DOMIAS is significantly more successful at MIA than previous work, especially at attacking uncommon samples. The latter is disconcerting since these samples may correspond to underrepresented groups. We also demonstrate how DOMIAS' MIA performance score provides an interpretable metric for privacy, giving data publishers a new tool for achieving the desired privacy-utility trade-off in their synthetic data.

Via

Access Paper or Ask Questions

Understanding the Impact of Competing Events on Heterogeneous Treatment Effect Estimation from Time-to-Event Data

Feb 23, 2023

Alicia Curth, Mihaela van der Schaar

Figure 1 for Understanding the Impact of Competing Events on Heterogeneous Treatment Effect Estimation from Time-to-Event Data

Figure 2 for Understanding the Impact of Competing Events on Heterogeneous Treatment Effect Estimation from Time-to-Event Data

Figure 3 for Understanding the Impact of Competing Events on Heterogeneous Treatment Effect Estimation from Time-to-Event Data

Figure 4 for Understanding the Impact of Competing Events on Heterogeneous Treatment Effect Estimation from Time-to-Event Data

Abstract:We study the problem of inferring heterogeneous treatment effects (HTEs) from time-to-event data in the presence of competing events. Albeit its great practical relevance, this problem has received little attention compared to its counterparts studying HTE estimation without time-to-event data or competing events. We take an outcome modeling approach to estimating HTEs, and consider how and when existing prediction models for time-to-event data can be used as plug-in estimators for potential outcomes. We then investigate whether competing events present new challenges for HTE estimation -- in addition to the standard confounding problem --, and find that, because there are multiple definitions of causal effects in this setting -- namely total, direct and separable effects --, competing events can act as an additional source of covariate shift depending on the desired treatment effect interpretation and associated estimand. We theoretically analyze and empirically illustrate when and how these challenges play a role when using generic machine learning prediction models for the estimation of HTEs.

* To appear in the Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023, Valencia, Spain. PMLR: Volume 206

Via

Access Paper or Ask Questions

Improving Adaptive Conformal Prediction Using Self-Supervised Learning

Feb 23, 2023

Nabeel Seedat, Alan Jeffares, Fergus Imrie, Mihaela van der Schaar

Abstract:Conformal prediction is a powerful distribution-free tool for uncertainty quantification, establishing valid prediction intervals with finite-sample guarantees. To produce valid intervals which are also adaptive to the difficulty of each instance, a common approach is to compute normalized nonconformity scores on a separate calibration set. Self-supervised learning has been effectively utilized in many domains to learn general representations for downstream predictors. However, the use of self-supervision beyond model pretraining and representation learning has been largely unexplored. In this work, we investigate how self-supervised pretext tasks can improve the quality of the conformal regressors, specifically by improving the adaptability of conformal intervals. We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores. We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.

* Accepted to the International Conference on Artificial Intelligence and Statistics (AISTATS 2023). *Seedat & Jeffares contributed equally

Via

Access Paper or Ask Questions