Moninder Singh

Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions

Feb 17, 2023
Manish Nagireddy, Moninder Singh, Samuel C. Hoffman, Evaline Ju, Karthikeyan Natesan Ramamurthy, Kush R. Varshney


Ensuring trustworthiness in machine learning (ML) models is a multi-dimensional task. In addition to the traditional notion of predictive performance, other notions such as privacy, fairness, robustness to distribution shift, adversarial robustness, interpretability, explainability, and uncertainty quantification are important considerations to evaluate and improve (if deficient). However, these sub-disciplines or 'pillars' of trustworthiness have largely developed independently, which has limited our understanding of their interactions in real-world ML pipelines. In this paper, focusing specifically on compositions of functions arising from the different pillars, we aim to reduce this gap, develop new insights for trustworthy ML, and answer questions such as the following. Does the composition of multiple fairness interventions result in a fairer model compared to a single intervention? How do bias mitigation algorithms for fairness affect local post-hoc explanations? Does a defense algorithm for untargeted adversarial attacks continue to be effective when composed with a privacy transformation? Toward this end, we report initial empirical results and new insights from 9 different compositions of functions (or pipelines) on 7 real-world datasets along two trustworthiness dimensions: fairness and explainability. We also report progress, and implementation choices, on an extensible composer tool to encourage the combination of functionalities from multiple pillars. To date, the tool supports bias mitigation algorithms for fairness and post-hoc explainability methods. We hope this line of work encourages the thoughtful consideration of multiple pillars when attempting to formulate and resolve a trustworthiness problem.
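As a rough illustration of the kind of pipeline studied here (not the paper's composer tool or its API), the sketch below composes a pre-processing fairness intervention (instance reweighing computed from group/label frequencies) with model training and a post-hoc global explanation (permutation importance). The data and feature names are synthetic placeholders.

```python
# A minimal sketch (not the paper's composer tool): compose a pre-processing
# fairness intervention (reweighing) with training and a post-hoc explanation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 2000
g = rng.integers(0, 2, n)                       # protected attribute (hypothetical)
x = rng.normal(size=(n, 3)) + g[:, None] * 0.5  # features correlated with the group
y = (x[:, 0] + 0.8 * g + rng.normal(scale=0.5, size=n) > 0.7).astype(int)

def reweigh(g, y):
    """Instance weights w(g, y) = P(g) * P(y) / P(g, y), as in standard reweighing."""
    w = np.empty(len(y))
    for gi in (0, 1):
        for yi in (0, 1):
            mask = (g == gi) & (y == yi)
            w[mask] = (g == gi).mean() * (y == yi).mean() / max(mask.mean(), 1e-12)
    return w

weights = reweigh(g, y)                         # fairness pillar: pre-processing
X = np.column_stack([x, g])
clf = LogisticRegression().fit(X, y, sample_weight=weights)

# explainability pillar: post-hoc global importance of each feature
imp = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
print(dict(zip(["x0", "x1", "x2", "group"], imp.importances_mean.round(3))))
```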


On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach

Nov 02, 2022
Dennis Wei, Rahul Nair, Amit Dhurandhar, Kush R. Varshney, Elizabeth M. Daly, Moninder Singh


Interpretable and explainable machine learning has seen a recent surge of interest. We focus on safety as a key motivation behind the surge and make the relationship between interpretability and safety more quantitative. Toward assessing safety, we introduce the concept of maximum deviation via an optimization problem to find the largest deviation of a supervised learning model from a reference model regarded as safe. We then show how interpretability facilitates this safety assessment. For models including decision trees, generalized linear and additive models, the maximum deviation can be computed exactly and efficiently. For tree ensembles, which are not regarded as interpretable, discrete optimization techniques can still provide informative bounds. For a broader class of piecewise Lipschitz functions, we leverage the multi-armed bandit literature to show that interpretability produces tighter (regret) bounds on the maximum deviation. We present case studies, including one on mortgage approval, to illustrate our methods and the insights about models that may be obtained from deviation maximization.

* Published at NeurIPS 2022 
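For intuition about the "exact and efficient" case mentioned in the abstract, here is a small sketch under the assumption that both the model and the reference are linear and the input domain is a box: the maximum deviation then has a closed form because a linear function attains its extrema at box vertices. This is an illustration of the special case, not the paper's general algorithm.

```python
# Maximum deviation |f(x) - f_ref(x)| over a box domain for two linear models.
# A linear difference attains its max/min at box vertices, so the answer is closed form.
import numpy as np

def max_deviation_linear(w, b, w_ref, b_ref, lo, hi):
    dw, db = np.asarray(w) - np.asarray(w_ref), b - b_ref
    upper = db + np.maximum(dw * lo, dw * hi).sum()  # max of the signed difference
    lower = db + np.minimum(dw * lo, dw * hi).sum()  # min of the signed difference
    return max(abs(upper), abs(lower))

# toy usage: two scoring models on features scaled to [0, 1]
print(max_deviation_linear(w=[1.0, -2.0, 0.5], b=0.1,
                           w_ref=[0.8, -1.5, 0.7], b_ref=0.0,
                           lo=np.zeros(3), hi=np.ones(3)))
```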

Anomaly Attribution with Likelihood Compensation

Aug 23, 2022
Tsuyoshi Idé, Amit Dhurandhar, Jiří Navrátil, Moninder Singh, Naoki Abe


This paper addresses the task of explaining anomalous predictions of a black-box regression model. When using a black-box model, such as one to predict building energy consumption from many sensor measurements, we often have a situation where some observed samples may significantly deviate from their prediction. This may be due to a sub-optimal black-box model, or simply because those samples are outliers. In either case, one would ideally want to compute a "responsibility score" indicative of the extent to which an input variable is responsible for the anomalous output. In this work, we formalize this task as a statistical inverse problem: given the model's deviation from the expected value, infer the responsibility score of each of the input variables. We propose a new method called likelihood compensation (LC), which is founded on the likelihood principle and computes a correction to each input variable. To the best of our knowledge, this is the first principled framework that computes a responsibility score for real-valued anomalous model deviations. We apply our approach to a real-world building energy prediction task and confirm its utility based on expert feedback.

* 8 pages, 7 figures 
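The abstract does not specify the LC algorithm in detail, so the following is only an illustrative surrogate of the underlying idea: find a small per-variable correction that reconciles a black-box regressor with an anomalous observation, and read the size of each correction as a responsibility signal. All names and the optimization scheme here are hypothetical, not the paper's method.

```python
# Illustrative surrogate (not the paper's LC algorithm): find a small per-variable
# correction delta so that f(x + delta) moves toward the observed anomalous y,
# trading off fit against the size of the correction. Larger |delta_i| suggests
# variable i bears more "responsibility" for the deviation.
import numpy as np

def responsibility_scores(f, x, y_obs, lam=1.0, lr=0.05, steps=500, eps=1e-4):
    delta = np.zeros_like(x, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(delta)
        for i in range(len(x)):                  # finite-difference gradient of f
            e = np.zeros_like(delta)
            e[i] = eps
            grad[i] = (f(x + delta + e) - f(x + delta - e)) / (2 * eps)
        resid = f(x + delta) - y_obs
        delta -= lr * (2 * resid * grad + 2 * lam * delta)
    return delta

# toy black-box model and an anomalous observation
f = lambda z: 2.0 * z[0] - z[1] + 0.5 * z[2]
x = np.array([1.0, 2.0, 0.0])
print(responsibility_scores(f, x, y_obs=5.0).round(3))
```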

Write It Like You See It: Detectable Differences in Clinical Notes By Race Lead To Differential Model Recommendations

May 08, 2022
Hammaad Adam, Ming Ying Yang, Kenrick Cato, Ioana Baldini, Charles Senteio, Leo Anthony Celi, Jiaming Zeng, Moninder Singh, Marzyeh Ghassemi


Clinical notes are becoming an increasingly important data source for machine learning (ML) applications in healthcare. Prior research has shown that deploying ML models can perpetuate existing biases against racial minorities, as bias can be implicitly embedded in data. In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes. Our work makes three key contributions. First, we find that models can identify patient self-reported race from clinical notes even when the notes are stripped of explicit indicators of race. Second, we determine that human experts are not able to accurately predict patient race from the same redacted clinical notes. Finally, we demonstrate the potential harm of this implicit information in a simulation study, and show that models trained on these race-redacted clinical notes can still perpetuate existing biases in clinical treatment decisions.

* Accepted to the 2022 AAAI/ACM Conference on AI, Ethics, and Society (AIES '22), ACM, Oxford, UK, 2022 
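A minimal sketch of the probing setup the abstract describes: fit a bag-of-words classifier to predict self-reported race from redacted note text and check whether it beats chance. The notes and labels below are generic placeholders, not clinical data, and the probe is a generic TF-IDF pipeline rather than the paper's models.

```python
# Hypothetical race-detectability probe on placeholder "redacted" notes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

notes = ["pt follows up with pcp, pain improving",
         "patient stable, discharged home",
         "pt seen in clinic, labs within normal limits",
         "note documents medication review"] * 50
labels = [0, 1, 0, 1] * 50   # hypothetical self-reported race labels (binary, toy)

probe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
auc = cross_val_score(probe, notes, labels, cv=5, scoring="roc_auc").mean()
print(f"race-probe AUC on redacted notes: {auc:.2f} (0.5 = chance)")
```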

Ground-Truth, Whose Truth? -- Examining the Challenges with Annotating Toxic Text Datasets

Dec 07, 2021
Kofi Arhin, Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy, Moninder Singh


The use of machine learning (ML)-based language models (LMs) to monitor content online is on the rise. For toxic text identification, task-specific fine-tuning of these models is performed using datasets labeled by annotators who provide ground-truth labels in an effort to distinguish between offensive and normal content. These projects have led to the development, improvement, and expansion of large datasets over time, and have contributed immensely to research on natural language. Despite these achievements, existing evidence suggests that ML models built on these datasets do not always result in desirable outcomes. Therefore, using a design science research (DSR) approach, this study examines selected toxic text datasets with the goal of shedding light on some of their inherent issues and contributing to discussions on navigating these challenges for existing and future projects. To achieve this goal, we re-annotate samples from three toxic text datasets and find that a multi-label approach to annotating toxic text samples can help to improve dataset quality. While this approach may not improve the traditional metric of inter-annotator agreement, it may better capture dependence on context and diversity among annotators. We discuss the implications of these results for both theory and practice.

* 15 pages 
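A small sketch of the comparison the abstract suggests, on hypothetical annotations: exact-match agreement under single-label annotation versus per-item Jaccard overlap under multi-label annotation, which can stay high even when annotators pick non-identical label sets.

```python
# Hypothetical annotations: exact-match agreement (single-label) vs. Jaccard overlap
# (multi-label) between two annotators over three text samples.
from itertools import combinations

single = {"a1": ["toxic", "ok", "toxic"], "a2": ["hate", "ok", "toxic"]}
multi  = {"a1": [{"toxic", "insult"}, set(), {"toxic"}],
          "a2": [{"toxic", "hate"},  set(), {"toxic", "profanity"}]}

def exact_match(ann):
    pairs = list(combinations(ann.values(), 2))
    return sum(sum(x == y for x, y in zip(u, v)) / len(u) for u, v in pairs) / len(pairs)

def mean_jaccard(ann):
    pairs = list(combinations(ann.values(), 2))
    def j(s, t):
        return 1.0 if not (s | t) else len(s & t) / len(s | t)
    return sum(sum(j(x, y) for x, y in zip(u, v)) / len(u) for u, v in pairs) / len(pairs)

print("single-label exact agreement:", round(exact_match(single), 2))
print("multi-label mean Jaccard:    ", round(mean_jaccard(multi), 2))
```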

An Empirical Study of Accuracy, Fairness, Explainability, Distributional Robustness, and Adversarial Robustness

Sep 29, 2021
Moninder Singh, Gevorg Ghalachyan, Kush R. Varshney, Reginald E. Bryant


To ensure trust in AI models, it is becoming increasingly apparent that evaluation of models must be extended beyond traditional performance metrics, like accuracy, to other dimensions, such as fairness, explainability, adversarial robustness, and distribution shift. We describe an empirical study to evaluate multiple model types on various metrics along these dimensions on several datasets. Our results show that no particular model type performs well on all dimensions, and demonstrate the kinds of trade-offs involved in selecting models evaluated along multiple dimensions.

* presented at the 2021 KDD Workshop on Measures and Best Practices for Responsible AI  
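A compact sketch of the evaluation loop the abstract describes: train several model types on the same split and report accuracy alongside one fairness metric (demographic-parity difference). The dataset here is synthetic and the metric set is deliberately small; the paper's study covers more dimensions and datasets.

```python
# Multi-dimensional model comparison on synthetic data: accuracy plus a simple
# fairness metric (demographic-parity difference) for several model types.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 4000
g = rng.integers(0, 2, n)
X = np.column_stack([rng.normal(size=(n, 4)), g])
y = (X[:, 0] + 0.6 * g + rng.normal(scale=0.8, size=n) > 0.5).astype(int)
Xtr, Xte, ytr, yte, gtr, gte = train_test_split(X, y, g, test_size=0.3, random_state=1)

for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=100, random_state=1),
              GradientBoostingClassifier(random_state=1)):
    pred = model.fit(Xtr, ytr).predict(Xte)
    acc = (pred == yte).mean()
    dp_gap = abs(pred[gte == 1].mean() - pred[gte == 0].mean())  # demographic parity diff
    print(f"{type(model).__name__:28s} acc={acc:.3f} dp_gap={dp_gap:.3f}")
```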

AI Explainability 360: Impact and Design

Sep 24, 2021
Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilovic, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang


As artificial intelligence and machine learning algorithms become increasingly prevalent in society, multiple stakeholders are calling for these algorithms to provide explanations. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, have different explanation needs. To address these needs, in 2019, we created AI Explainability 360 (Arya et al. 2020), an open source software toolkit featuring ten diverse and state-of-the-art explainability methods and two evaluation metrics. This paper examines the impact of the toolkit with several case studies, statistics, and community feedback. The different ways in which users have experienced AI Explainability 360 have resulted in multiple types of impact and improvements in multiple metrics, highlighted by the adoption of the toolkit by the independent LF AI & Data Foundation. The paper also describes the flexible design of the toolkit, examples of its use, and the significant educational material and documentation available to its users.

* arXiv admin note: text overlap with arXiv:1909.03012 

Your fairness may vary: Group fairness of pretrained language models in toxic text classification

Aug 03, 2021
Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy, Mikhail Yurochkin, Moninder Singh


We study the performance-fairness trade-off in more than a dozen fine-tuned language models (LMs) for toxic text classification. We empirically show that no blanket statement can be made with respect to the bias of large versus regular versus compressed models. Moreover, we find that focusing on fairness-agnostic performance metrics can lead to models with varied fairness characteristics.
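A minimal sketch of one group-fairness check used in this setting: compare false-positive rates of a toxicity classifier across comments associated with different identity groups. The predictions and group tags below are hypothetical stand-ins for model outputs on an identity-annotated evaluation set.

```python
# Hypothetical predictions: compare false-positive rates of a toxicity classifier
# across identity subgroups, a common group-fairness probe for toxic text models.
import numpy as np

y_true = np.array([0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b", "b", "a", "b", "a"])

def fpr(yt, yp):
    neg = yt == 0
    return (yp[neg] == 1).mean() if neg.any() else float("nan")

for gname in np.unique(group):
    m = group == gname
    print(f"group {gname}: FPR = {fpr(y_true[m], y_pred[m]):.2f}")
```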


Understanding racial bias in health using the Medical Expenditure Panel Survey data

Nov 04, 2019
Moninder Singh, Karthikeyan Natesan Ramamurthy


Over the years, several studies have demonstrated that there exist significant disparities in health indicators across various groups in the United States population. Healthcare expense is used as a proxy for health in algorithms that drive healthcare systems, and this exacerbates existing bias. In this work, we focus on the presence of racial bias in health indicators in the publicly available and nationally representative Medical Expenditure Panel Survey (MEPS) data. We show that predictive models for care management trained on this data inherit this bias. Finally, we demonstrate that this inherited bias can be reduced significantly using simple mitigation techniques.

* 8 pages, 8 tables 
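To make the kind of disparity and mitigation the abstract mentions concrete, here is a hedged sketch on synthetic utilization scores (not the MEPS data, and not necessarily the paper's mitigation technique): it measures the disparate-impact ratio of a care-management referral threshold, then applies a simple post-processing fix that equalizes selection rates across groups.

```python
# Synthetic care-management scores: disparate impact of a single global threshold,
# then a simple per-group-threshold mitigation that equalizes referral rates.
import numpy as np

rng = np.random.default_rng(7)
scores = np.concatenate([rng.normal(0.0, 1.0, 5000),     # group 0 utilization scores
                         rng.normal(-0.4, 1.0, 5000)])   # group 1 scores shifted down
group = np.repeat([0, 1], 5000)

def disparate_impact(selected, group):
    return selected[group == 1].mean() / selected[group == 0].mean()

selected = scores > 1.0                                   # single global threshold
print("DI (single threshold):   ", round(disparate_impact(selected, group), 2))

# mitigation: per-group thresholds chosen so both groups have the same referral rate
rate = selected.mean()
thr = {g: np.quantile(scores[group == g], 1 - rate) for g in (0, 1)}
mitigated = np.array([s > thr[g] for s, g in zip(scores, group)])
print("DI (per-group threshold):", round(disparate_impact(mitigated, group), 2))
```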

One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques

Sep 14, 2019
Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilović, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang


As artificial intelligence and machine learning algorithms make further inroads into society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, present different requirements for explanations. Toward addressing these needs, we introduce AI Explainability 360 (http://aix360.mybluemix.net/), an open-source software toolkit featuring eight diverse and state-of-the-art explainability methods and two evaluation metrics. Equally important, we provide a taxonomy to help entities requiring explanations to navigate the space of explanation methods, not only those in the toolkit but also in the broader literature on explainability. For data scientists and other users of the toolkit, we have implemented an extensible software architecture that organizes methods according to their place in the AI modeling pipeline. We also discuss enhancements to bring research innovations closer to consumers of explanations, ranging from simplified, more accessible versions of algorithms, to tutorials and an interactive web demo to introduce AI explainability to different audiences and application domains. Together, our toolkit and taxonomy can help identify gaps where more explainability methods are needed and provide a platform to incorporate them as they are developed.
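The abstract describes an architecture that organizes explainability methods by their place in the AI modeling pipeline. As a purely hypothetical sketch of that idea (this is not the actual AIX360 class hierarchy or API), a small registry keyed by pipeline stage and explanation scope might look like the following.

```python
# Hypothetical pipeline-stage taxonomy for explainers, in the spirit of the
# architecture the abstract describes (not the actual AIX360 API).
from dataclasses import dataclass, field

@dataclass
class ExplainerEntry:
    name: str
    stage: str          # "data", "directly_interpretable_model", or "post_hoc"
    scope: str          # "local" or "global"

@dataclass
class Taxonomy:
    entries: list = field(default_factory=list)

    def register(self, entry):
        self.entries.append(entry)

    def for_stage(self, stage):
        return [e.name for e in self.entries if e.stage == stage]

tax = Taxonomy()
tax.register(ExplainerEntry("prototype-based data explainer", "data", "global"))
tax.register(ExplainerEntry("rule-based interpretable model", "directly_interpretable_model", "global"))
tax.register(ExplainerEntry("contrastive local explainer", "post_hoc", "local"))
print(tax.for_stage("post_hoc"))
```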
