Rich Caruana

Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?

Apr 23, 2023

Zhi Chen, Sarah Tan, Urszula Chajewska, Cynthia Rudin, Rich Caruana

Abstract: Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have a large impact on the resulting machine learning model. In medical applications, the consequences may affect healthcare decisions. There are many methods in the literature for dealing with missing values, including state-of-the-art methods which often depend on black-box models for imputation. In this work, we show how recent advances in interpretable machine learning provide a new perspective for understanding and tackling the missing value problem. We propose methods based on high-accuracy glass-box Explainable Boosting Machines (EBMs) that can help users (1) gain new insights on missingness mechanisms and better understand the causes of missingness, and (2) detect -- or even alleviate -- potential risks introduced by imputation algorithms. Experiments on real-world medical datasets illustrate the effectiveness of the proposed methods.

* Preprint of a paper accepted by CHIL 2023

GAM Coach: Towards Interactive and User-centered Algorithmic Recourse

Mar 01, 2023

Zijie J. Wang, Jennifer Wortman Vaughan, Rich Caruana, Duen Horng Chau

Abstract: Machine learning (ML) recourse techniques are increasingly used in high-stakes domains, providing end users with actions to alter ML predictions, but they assume ML developers understand what input variables can be changed. However, a recourse plan's actionability is subjective and unlikely to match developers' expectations completely. We present GAM Coach, a novel open-source system that adapts integer linear programming to generate customizable counterfactual explanations for Generalized Additive Models (GAMs), and leverages interactive visualizations to enable end users to iteratively generate recourse plans meeting their needs. A quantitative user study with 41 participants shows our tool is usable and useful, and users prefer personalized recourse plans over generic plans. Through a log analysis, we explore how users discover satisfactory recourse plans, and provide empirical evidence that transparency can lead to more opportunities for everyday users to discover counterintuitive patterns in ML models. GAM Coach is available at: https://poloclub.github.io/gam-coach/.

* Accepted to CHI 2023. 20 pages, 12 figures. For a demo video, see https://youtu.be/ubacP34H9XE. For a live demo, visit https://poloclub.github.io/gam-coach/
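
As a rough illustration of what a recourse plan on a GAM looks like, the sketch below searches for the smallest set of feature changes that flips a prediction, restricted to features the user marks as changeable. It is not GAM Coach's method: the abstract describes an integer linear programming formulation, and this sketch swaps in exhaustive search for brevity. The shape functions, bin labels, and the names `gam_score` and `find_recourse` are all hypothetical.

```python
from itertools import product

# Hypothetical binned GAM: each feature maps a bin label to an additive score.
SHAPE_FUNCTIONS = {
    "income": {"low": -1.0, "mid": 0.2, "high": 1.1},
    "debt": {"low": 0.8, "mid": -0.1, "high": -1.2},
    "age": {"young": -0.3, "old": 0.3},
}
INTERCEPT = -0.5


def gam_score(x):
    """Additive score: the intercept plus one table lookup per feature."""
    return INTERCEPT + sum(SHAPE_FUNCTIONS[f][v] for f, v in x.items())


def find_recourse(x, changeable):
    """Return the plan changing the fewest features that makes the score positive.

    Only features listed in `changeable` may be altered. Ties keep the first
    plan found. Returns None when no allowed change flips the prediction.
    Exhaustive search stands in for the ILP solver named in the abstract.
    """
    features = sorted(changeable)
    best = None
    for values in product(*(SHAPE_FUNCTIONS[f] for f in features)):
        candidate = dict(x, **dict(zip(features, values)))
        if gam_score(candidate) <= 0:
            continue
        changes = {f: v for f, v in zip(features, values) if x[f] != v}
        if best is None or len(changes) < len(best):
            best = changes
    return best


applicant = {"income": "low", "debt": "high", "age": "young"}
plan = find_recourse(applicant, changeable={"income", "debt"})
```

Letting the caller pass `changeable` mirrors the abstract's point that actionability is subjective: the same model yields different plans, or none, depending on which features a given user can actually change.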

Estimating Discontinuous Time-Varying Risk Factors and Treatment Benefits for COVID-19 with Interpretable ML

Nov 15, 2022

Benjamin Lengerich, Mark E. Nunnally, Yin Aphinyanaphongs, Rich Caruana

Abstract: Treatment protocols, disease understanding, and viral characteristics changed over the course of the COVID-19 pandemic; as a result, the risks associated with patient comorbidities and biomarkers also changed. We add to the conversation regarding inflammation, hemostasis and vascular function in COVID-19 by performing a time-varying observational analysis of over 4000 patients hospitalized for COVID-19 in a New York City hospital system from March 2020 to August 2021. To perform this analysis, we apply tree-based generalized additive models with temporal interactions, which recover discontinuous risk changes caused by discrete protocol changes. We find that the biomarkers of thrombosis increasingly predicted mortality from March 2020 to August 2021, while the association between biomarkers of inflammation and thrombosis weakened. Beyond COVID-19, this presents a straightforward methodology to estimate unknown and discontinuous time-varying effects.

* Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 14 pages

Using Interpretable Machine Learning to Predict Maternal and Fetal Outcomes

Jul 12, 2022

Tomas M. Bosschieter, Zifei Xu, Hui Lan, Benjamin J. Lengerich, Harsha Nori, Kristin Sitcov, Vivienne Souter, Rich Caruana

Abstract: Most pregnancies and births result in a good outcome, but complications are not uncommon and when they do occur, they can be associated with serious implications for mothers and babies. Predictive modeling has the potential to improve outcomes through better understanding of risk factors, heightened surveillance, and more timely and appropriate interventions, thereby helping obstetricians deliver better care. For three types of complications, we identify and study the most important risk factors using the Explainable Boosting Machine (EBM), a glass-box model, in order to gain intelligibility: (i) Severe Maternal Morbidity (SMM), (ii) shoulder dystocia, and (iii) preterm preeclampsia. While using the interpretability of EBMs to reveal surprising insights into the features contributing to risk, our experiments show EBMs match the accuracy of other black-box ML methods such as deep neural nets and random forests.

* DSHealth at SIGKDD 2022, 5 pages, 3 figures

Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values

Jun 30, 2022

Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana

Abstract: Machine learning (ML) interpretability techniques can reveal undesirable patterns in data that models exploit to make predictions--potentially causing harms once deployed. However, how to take action to address these patterns is not always clear. In a collaboration between ML and human-computer interaction researchers, physicians, and data scientists, we develop GAM Changer, the first interactive system to help domain experts and data scientists easily and responsibly edit Generalized Additive Models (GAMs) and fix problematic patterns. With novel interaction techniques, our tool puts interpretability into action--empowering users to analyze, validate, and align model behaviors with their knowledge and values. Physicians have started to use our tool to investigate and fix pneumonia and sepsis risk prediction models, and an evaluation with 7 data scientists working in diverse domains highlights that our tool is easy to use, meets their model editing needs, and fits into their current workflows. Built with modern web technologies, our tool runs locally in users' web browsers or computational notebooks, lowering the barrier to use. GAM Changer is available at the following public demo link: https://interpret.ml/gam-changer.

* Accepted at KDD 2022. 11 pages, 19 figures. For a demo video, see https://youtu.be/D6whtfInqTc. For a live demo, visit https://interpret.ml/gam-changer

Differentially Private Estimation of Heterogeneous Causal Effects

Feb 22, 2022

Fengshi Niu, Harsha Nori, Brian Quistorff, Rich Caruana, Donald Ngwe, Aadharsh Kannan

Abstract: Estimating heterogeneous treatment effects in domains such as healthcare or social science often involves sensitive data where protecting privacy is important. We introduce a general meta-algorithm for estimating conditional average treatment effects (CATE) with differential privacy (DP) guarantees. Our meta-algorithm can work with simple, single-stage CATE estimators such as S-learner and more complex multi-stage estimators such as DR and R-learner. We perform a tight privacy analysis by taking advantage of sample splitting in our meta-algorithm and the parallel composition property of differential privacy. In this paper, we implement our approach using DP-EBMs as the base learner. DP-EBMs are interpretable, high-accuracy models with privacy guarantees, which allow us to directly observe the impact of DP noise on the learned causal model. Our experiments show that multi-stage CATE estimators incur larger accuracy loss than single-stage CATE or ATE estimators and that most of the accuracy loss from differential privacy is due to an increase in variance, not biased estimates of treatment effects.
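
To make the single-stage case concrete, here is a minimal, non-private sketch of an S-learner: one model is fit on covariates plus the treatment indicator, and the CATE is the difference between its predictions with treatment switched on and off. The base learner here is a plain table of cell means, chosen for brevity; it is an assumption of this sketch and stands in for the DP-EBM used in the paper. The data and function names are hypothetical, and no privacy noise is added.

```python
from collections import defaultdict


def fit_cell_means(rows):
    """Base learner: mean outcome for each (covariate group, treatment) cell."""
    totals = defaultdict(lambda: [0.0, 0])
    for group, treated, outcome in rows:
        cell = totals[(group, treated)]
        cell[0] += outcome
        cell[1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


def s_learner_cate(model, group):
    """CATE(group) = f(group, treated=1) - f(group, treated=0)."""
    return model[(group, 1)] - model[(group, 0)]


# Hypothetical observations: (covariate group, treatment flag, outcome).
rows = [
    ("young", 1, 5.0), ("young", 1, 7.0), ("young", 0, 4.0), ("young", 0, 4.0),
    ("old", 1, 9.0), ("old", 1, 11.0), ("old", 0, 5.0), ("old", 0, 7.0),
]
model = fit_cell_means(rows)
cate_young = s_learner_cate(model, "young")  # 6.0 - 4.0
cate_old = s_learner_cate(model, "old")      # 10.0 - 6.0
```

Because a single fitted model produces both predictions, only one training step would need to be privatized, which is consistent with the abstract's finding that single-stage estimators lose less accuracy than multi-stage ones.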

GAM Changer: Editing Generalized Additive Models with Interactive Visualization

Dec 06, 2021

Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana

Abstract: Recent strides in interpretable machine learning (ML) research reveal that models exploit undesirable patterns in the data to make predictions, which potentially causes harms in deployment. However, it is unclear how we can fix these models. We present our ongoing work, GAM Changer, an open-source interactive system to help data scientists and domain experts easily and responsibly edit their Generalized Additive Models (GAMs). With novel visualization techniques, our tool puts interpretability into action -- empowering human users to analyze, validate, and align model behaviors with their knowledge and values. Built using modern web technologies, our tool runs locally in users' computational notebooks or web browsers without requiring extra compute resources, lowering the barrier to creating more responsible ML models. GAM Changer is available at https://interpret.ml/gam-changer.

* 7 pages, 15 figures, accepted to the Research2Clinics workshop at NeurIPS 2021. For a demo video, see https://youtu.be/2gVSoPoSeJ8. For a live demo, visit https://interpret.ml/gam-changer/

Extracting Clinician's Goals by What-if Interpretable Modeling

Oct 28, 2021

Chun-Hao Chang, George Alexandru Adam, Rich Caruana, Anna Goldenberg

Abstract: Although reinforcement learning (RL) has had tremendous success in many fields, applying RL to real-world settings such as healthcare is challenging when the reward is hard to specify and no exploration is allowed. In this work, we focus on recovering clinicians' rewards in treating patients. We incorporate what-if reasoning to explain clinicians' actions based on future outcomes. We use generalized additive models (GAMs) - a class of accurate, interpretable models - to recover the reward. In both simulation and a real-world hospital dataset, we show our model outperforms baselines. Finally, our model's explanations match several clinical guidelines when treating patients, whereas the previously used linear model often contradicts them.

* Submitted to AISTATS 2022

Accuracy, Interpretability, and Differential Privacy via Explainable Boosting

Jun 17, 2021

Harsha Nori, Rich Caruana, Zhiqi Bu, Judy Hanwen Shen, Janardhan Kulkarni

Abstract: We show that adding differential privacy to Explainable Boosting Machines (EBMs), a recent method for training interpretable ML models, yields state-of-the-art accuracy while protecting privacy. Our experiments on multiple classification and regression datasets show that DP-EBM models suffer surprisingly little accuracy loss even with strong differential privacy guarantees. In addition to high accuracy, two other benefits of applying DP to EBMs are: a) trained models provide exact global and local interpretability, which is often important in settings where differential privacy is needed; and b) the models can be edited after training without loss of privacy to correct errors which DP noise may have introduced.

* To be published in ICML 2021. 12 pages, 6 figures
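
Benefit (b) rests on the post-processing property of differential privacy: any edit computed only from already-released values costs no extra privacy budget. The sketch below illustrates that idea on a noisy histogram rather than on an EBM. The Laplace mechanism, the counts, and the function names are assumptions of this sketch and are not taken from the paper, whose training procedure is not described in the abstract.

```python
import random


def noisy_counts(counts, epsilon, rng):
    """Laplace mechanism: each count has sensitivity 1, so use scale 1/epsilon."""
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return [
        c + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        for c in counts
    ]


def clamp_non_negative(values):
    """Post-training edit: reads only the released values, never the raw data."""
    return [max(0.0, v) for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
released = noisy_counts([0, 3, 12, 1], epsilon=0.5, rng=rng)
edited = clamp_non_negative(released)
```

Noise can push a small count below zero, which is impossible for a real count; clamping repairs that artifact without touching the private data, so the original epsilon guarantee still holds.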

NODE-GAM: Neural Generalized Additive Model for Interpretable Deep Learning

Jun 03, 2021

Chun-Hao Chang, Rich Caruana, Anna Goldenberg

Abstract: Deployment of machine learning models in real high-risk settings (e.g. healthcare) often depends not only on a model's accuracy but also on its fairness, robustness and interpretability. Generalized Additive Models (GAMs) have a long history of use in these high-risk domains, but lack desirable features of deep learning such as differentiability and scalability. In this work, we propose a neural GAM (NODE-GAM) and neural GA$^2$M (NODE-GA$^2$M) that scale well to large datasets, while remaining interpretable and accurate. We show that our proposed models have comparable accuracy to other non-interpretable models, and outperform other GAMs on large datasets. We also show that our models are more accurate in a self-supervised learning setting when access to labeled data is limited.
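
For reference, the additive structure behind GAMs and GA$^2$Ms is usually written as below. This is the standard textbook form, not notation taken from the paper; how NODE-GAM parameterizes each $f_j$ is not described in the abstract. In practice, the pairwise sum often runs only over a selected subset of feature pairs.

```latex
% Standard additive forms; g is a link function and \beta_0 an intercept.
\begin{align}
\text{GAM:}\quad g\big(\mathbb{E}[y \mid x]\big) &= \beta_0 + \sum_{j} f_j(x_j) \\
\text{GA}^2\text{M:}\quad g\big(\mathbb{E}[y \mid x]\big) &= \beta_0 + \sum_{j} f_j(x_j) + \sum_{j < k} f_{jk}(x_j, x_k)
\end{align}
```

Each $f_j$ depends on a single feature and each $f_{jk}$ on a single pair, so every term can be plotted and inspected on its own, which is the source of the interpretability the abstract refers to.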
