Finale Doshi-Velez

A Deployed Online Reinforcement Learning Algorithm In An Oral Health Clinical Trial

Sep 03, 2024

Anna L. Trella, Kelly W. Zhang, Hinal Jajal, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

Abstract: Dental disease is a prevalent chronic condition associated with substantial financial burden, personal suffering, and increased risk of systemic diseases. Despite widespread recommendations for twice-daily tooth brushing, adherence to recommended oral self-care behaviors remains sub-optimal due to factors such as forgetfulness and disengagement. To address this, we developed Oralytics, an mHealth intervention system designed to complement clinician-delivered preventative care for marginalized individuals at risk for dental disease. Oralytics incorporates an online reinforcement learning algorithm to determine optimal times to deliver intervention prompts that encourage oral self-care behaviors. We have deployed Oralytics in a registered clinical trial. The deployment required careful design to manage challenges specific to the clinical trials setting in the U.S. In this paper, we (1) highlight key design decisions of the RL algorithm that address these challenges and (2) conduct a re-sampling analysis to evaluate algorithm design decisions. A second phase (randomized controlled trial) of Oralytics is planned to start in spring 2025.

Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models

Jul 20, 2024

Ze Yu Zhang, Arun Verma, Finale Doshi-Velez, Bryan Kian Hsiang Low

Abstract: Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established. Therefore, understanding how LLMs reason and make decisions is crucial for their safe deployment. This paper investigates how the uncertainty of responses generated by LLMs relates to the information provided in the input prompt. Leveraging the insight that LLMs learn to infer latent concepts during pretraining, we propose a prompt-response concept model that explains how LLMs generate responses and helps understand the relationship between prompts and response uncertainty. We show that the uncertainty decreases as the prompt's informativeness increases, similar to epistemic uncertainty. Our detailed experimental results on real datasets validate our proposed model.

* 27 pages, 11 figures

Towards Integrating Personal Knowledge into Test-Time Predictions

Jun 12, 2024

Isaac Lage, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Machine learning (ML) models can make decisions based on large amounts of data, but they can be missing personal knowledge available to human users about whom predictions are made. For example, a model trained to predict psychiatric outcomes may know nothing about a patient's social support system, and social support may look different for different patients. In this work, we introduce the problem of human feature integration, which provides a way to incorporate important personal knowledge from users without domain expertise into ML predictions. We characterize this problem through illustrative user stories and comparisons to existing approaches; we formally describe this problem in a way that paves the way for future technical solutions; and we provide a proof-of-concept study of a simple version of a solution to this problem in a semi-realistic setting.

A Sim2Real Approach for Identifying Task-Relevant Properties in Interpretable Machine Learning

May 31, 2024

Eura Nofshin, Esther Brown, Brian Lim, Weiwei Pan, Finale Doshi-Velez

Abstract: Existing user studies suggest that different tasks may require explanations with different properties. However, user studies are expensive. In this paper, we introduce a generalizable, cost-effective method for identifying task-relevant explanation properties in silico, which can guide the design of more expensive user studies. We use our approach to identify relevant proxies for three example tasks and validate our simulation with real user studies.

AI Procurement Checklists: Revisiting Implementation in the Age of AI Governance

Apr 23, 2024

Tom Zick, Mason Kortz, David Eaves, Finale Doshi-Velez

Abstract: Public sector use of AI has been quietly on the rise for the past decade, but only recently have efforts to regulate it entered the cultural zeitgeist. While simple to articulate, promoting ethical and effective rollouts of AI systems in government is a notoriously elusive task. On the one hand, there are hard-to-address pitfalls associated with AI-based tools, including concerns about bias towards marginalized communities, safety, and gameability. On the other, there is pressure not to make it too difficult to adopt AI, especially in the public sector, which typically has fewer resources than the private sector; conserving scarce government resources is often the draw of using AI-based tools in the first place. These tensions create a real risk that procedures built to ensure marginalized groups are not hurt by government use of AI will, in practice, be performative and ineffective. To inform the latest wave of regulatory efforts in the United States, we look to jurisdictions with mature regulations around government AI use. We report on lessons learned by officials in Brazil, Singapore and Canada, who have collectively implemented risk categories, disclosure requirements and assessments into the way they procure AI tools. In particular, we investigate two implemented checklists: the Canadian Directive on Automated Decision-Making (CDADM) and the World Economic Forum's AI Procurement in a Box (WEF). We detail three key pitfalls around expertise, risk frameworks and transparency that can decrease the efficacy of regulations aimed at government AI use, and suggest avenues for improvement.

Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders

Mar 13, 2024

Yaniv Yacoby, Weiwei Pan, Finale Doshi-Velez

Abstract: Inference for Variational Autoencoders (VAEs) consists of learning two models: (1) a generative model, which transforms a simple distribution over a latent space into the distribution over observed data, and (2) an inference model, which approximates the posterior of the latent codes given data. The two components are learned jointly via a lower bound to the generative model's log marginal likelihood. In early phases of joint training, the inference model poorly approximates the latent code posteriors. Recent work showed that this leads optimization to get stuck in local optima, negatively impacting the learned generative model. As such, recent work suggests ensuring a high-quality inference model via iterative training: maximizing the objective function relative to the inference model before every update to the generative model. Unfortunately, iterative training is inefficient, requiring heuristic criteria for reverting from iterative to joint training for speed. Here, we suggest an inference method that trains the generative and inference models independently. It approximates the posterior of the true model a priori; fixing this posterior approximation, we then maximize the lower bound relative to only the generative model. By conventional wisdom, this approach should rely on the true prior and likelihood of the true model to approximate its posterior (which are unknown). However, we show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior. We then use MAPA to develop a proof-of-concept inference method. We present preliminary results on low-dimensional synthetic data showing that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference performs better density estimation with less computation than baselines. Lastly, we present a roadmap for scaling the MAPA-based inference method to high-dimensional data.
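
For readers unfamiliar with the lower bound mentioned in this abstract, the standard VAE objective can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $\theta$ denotes the generative model's parameters, $\phi$ the inference model's parameters, $x$ an observation, and $z$ its latent code.

$$
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
$$

Joint training maximizes the right-hand side over both $\theta$ and $\phi$. The approach described in the abstract instead fixes the posterior approximation in place of $q_\phi$ and maximizes the bound over $\theta$ only.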

Monitoring Fidelity of Online Reinforcement Learning Algorithms in Clinical Trials

Feb 26, 2024

Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Iris Yan, Finale Doshi-Velez, Susan A. Murphy

Abstract: Online reinforcement learning (RL) algorithms offer great potential for personalizing treatment for participants in clinical trials. However, deploying an online, autonomous algorithm in the high-stakes healthcare setting makes quality control and data quality especially difficult to achieve. This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in clinical trials. It emphasizes the responsibility of the algorithm to (1) safeguard participants and (2) preserve the scientific utility of the data for post-trial analyses. We also present a framework for pre-deployment planning and real-time monitoring to help algorithm developers and clinical researchers ensure algorithm fidelity. To illustrate our framework's practical application, we present real-world examples from the Oralytics clinical trial. Since Spring 2023, this trial has successfully deployed an autonomous, online RL algorithm to personalize behavioral interventions for participants at risk for dental disease.

Guarantee Regions for Local Explanations

Feb 20, 2024

Marton Havasi, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Interpretability methods that utilise local surrogate models (e.g. LIME) are very good at describing the behaviour of the predictive model at a point of interest, but they are not guaranteed to extrapolate to the local region surrounding the point. However, overfitting to the local curvature of the predictive model and malicious tampering can significantly limit extrapolation. We propose an anchor-based algorithm for identifying regions in which local explanations are guaranteed to be correct by explicitly describing those intervals along which the input features can be trusted. Our method produces an interpretable feature-aligned box where the prediction of the local surrogate model is guaranteed to match the predictive model. We demonstrate that our algorithm can be used to find explanations with larger guarantee regions that better cover the data manifold compared to existing baselines. We also show how our method can identify misleading local explanations with significantly poorer guarantee regions.

Non-Stationary Latent Auto-Regressive Bandits

Feb 05, 2024

Anna L. Trella, Walter Dempsey, Finale Doshi-Velez, Susan A. Murphy

Abstract: We consider the stochastic multi-armed bandit problem with non-stationary rewards. We present a novel formulation of non-stationarity in the environment where changes in the mean reward of the arms over time are due to some unknown, latent, auto-regressive (AR) state of order $k$. We call this new environment the latent AR bandit. Different forms of the latent AR bandit appear in many real-world settings, especially in emerging scientific fields such as behavioral health or education where there are few mechanistic models of the environment. If the AR order $k$ is known, we propose an algorithm that achieves $\tilde{O}(k\sqrt{T})$ regret in this setting. Empirically, our algorithm outperforms standard UCB across multiple non-stationary environments, even if $k$ is mis-specified.
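
To make the setting in this abstract concrete, here is a minimal sketch of a toy environment in which arm means drift with a latent AR($k$) state, played by standard UCB1 (the baseline named in the abstract, not the paper's proposed algorithm). Everything specific is an assumption made for illustration: the function name `simulate_latent_ar_bandit`, the two-arm setup, the linear reward form `mu[arm] + beta[arm] * z`, and all coefficient and noise values.

```python
import math
import random


def simulate_latent_ar_bandit(T=500, gammas=(0.5, 0.3), seed=0):
    """Run standard UCB1 on a toy latent AR(k) bandit, with k = len(gammas).

    Returns (pull counts per arm, total reward). All constants are hypothetical.
    """
    rng = random.Random(seed)
    k = len(gammas)
    mu = [0.0, 0.2]      # hypothetical per-arm baseline means
    beta = [1.0, -1.0]   # hypothetical per-arm loadings on the latent state
    z_hist = [0.0] * k   # last k latent states, most recent first
    counts = [0, 0]
    sums = [0.0, 0.0]
    total = 0.0
    for t in range(1, T + 1):
        # Latent AR(k) update: z_t = sum_j gamma_j * z_{t-j} + Gaussian noise
        z = sum(g * zp for g, zp in zip(gammas, z_hist)) + rng.gauss(0.0, 0.1)
        z_hist = [z] + z_hist[:-1]
        if 0 in counts:
            # Pull each arm once before using the UCB index
            arm = counts.index(0)
        else:
            # UCB1 index: empirical mean plus exploration bonus
            arm = max(
                range(2),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        # The learner never observes z, only the noisy reward
        reward = mu[arm] + beta[arm] * z + rng.gauss(0.0, 0.1)
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total


counts, total = simulate_latent_ar_bandit(T=200, seed=1)
```

Because UCB1 averages over all past rewards, it has no way to track the drifting latent state, which is the kind of gap the abstract's non-stationary formulation targets.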

Semi-parametric Expert Bayesian Network Learning with Gaussian Processes and Horseshoe Priors

Jan 29, 2024

Yidou Weng, Finale Doshi-Velez

Abstract: This paper proposes a model learning Semi-parametric relationships in an Expert Bayesian Network (SEBN) with linear parameter and structure constraints. We use Gaussian Processes and a Horseshoe prior to introduce minimal nonlinear components. To prioritize modifying the expert graph over adding new edges, we optimize differential Horseshoe scales. In real-world datasets with unknown truth, we generate diverse graphs to accommodate user input, addressing identifiability issues and enhancing interpretability. Evaluation on synthetic and UCI Liver Disorders datasets, using metrics like structural Hamming Distance and test likelihood, demonstrates that our models outperform a state-of-the-art semi-parametric Bayesian Network model.

* 8 pages, 4 figures, AAAI-2024 workshops
