Rahul G. Krishnan

Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

Nov 26, 2024

Alexander Capstick, Rahul G. Krishnan, Payam Barnaghi

Abstract: Large language models (LLMs), trained on diverse data, effectively acquire a breadth of information across various domains. However, their computational complexity, cost, and lack of transparency hinder their direct application for specialised tasks. In fields such as clinical research, acquiring expert annotations or prior knowledge about predictive models is often costly and time-consuming. This study proposes using LLMs to elicit expert prior distributions for predictive models. This approach also provides an alternative to in-context learning, where language models are tasked with making predictions directly. We compare LLM-elicited and uninformative priors, evaluate whether LLMs truthfully generate parameter distributions, and propose a model selection strategy for in-context learning and prior elicitation. Our findings show that LLM-elicited prior parameter distributions significantly reduce predictive error compared to uninformative priors in low-data settings. Applied to clinical problems, this translates to fewer required biological samples, lowering cost and resources. Prior elicitation also consistently outperforms and proves more reliable than in-context learning at a lower cost, making it a preferred alternative in our setting. We demonstrate the utility of this method across various use cases, including clinical applications. For infection prediction, using LLM-elicited priors reduced the number of labels required to match the accuracy of an uninformative prior by 55%, reaching that accuracy 200 days earlier in the study.
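
To make the contrast between informative and uninformative priors concrete, here is a minimal sketch of a conjugate Normal-Normal update for a single parameter. It is not the paper's pipeline: the function name `posterior_normal_mean`, the data, and the "elicited" prior values are all hypothetical stand-ins chosen for illustration, and no LLM is involved.

```python
# Illustrative sketch only: conjugate Normal-Normal update for one parameter.
# The "elicited" prior values below are made-up stand-ins, not LLM output.

def posterior_normal_mean(data, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of a Gaussian mean with known noise variance."""
    n = len(data)
    post_precision = 1.0 / prior_var + n / noise_var
    post_mean = (prior_mean / prior_var + sum(data) / noise_var) / post_precision
    return post_mean, 1.0 / post_precision

true_value = 2.0                 # hypothetical ground truth
few_labels = [3.1, 2.8, 1.9]     # hypothetical low-data sample

# Near-flat prior: the posterior mean is essentially the sample mean.
flat_mean, flat_var = posterior_normal_mean(few_labels, 0.0, 1e6, 1.0)

# Informative prior centred near the truth: the estimate is pulled toward it.
elic_mean, elic_var = posterior_normal_mean(few_labels, 2.0, 0.25, 1.0)
```

With only three observations, the informative prior yields a posterior mean closer to the assumed true value and a smaller posterior variance. The benefit depends entirely on the prior being well placed, which is the property the abstract says the authors evaluate.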

Learning Predictive Checklists with Probabilistic Logic Programming

Nov 25, 2024

Yukti Makhija, Edward De Brouwer, Rahul G. Krishnan

Abstract: Checklists have been widely recognized as effective tools for completing complex tasks in a systematic manner. Although originally intended for use in procedural tasks, their interpretability and ease of use have led to their adoption for predictive tasks as well, including in clinical settings. However, designing checklists can be challenging, often requiring expert knowledge and manual rule design based on available data. Recent work has attempted to address this issue by using machine learning to automatically generate predictive checklists from data, although these approaches have been limited to Boolean data. We propose a novel method for learning predictive checklists from diverse data modalities, such as images and time series. Our approach relies on probabilistic logic programming, a learning paradigm that enables matching the discrete nature of checklists with continuous-valued data. We propose a regularization technique that trades off the information captured in discrete concepts of continuous data against a tunable level of interpretability for the learned checklist concepts. We demonstrate that our method outperforms various explainable machine learning techniques on prediction tasks involving image sequences, time series, and clinical notes.

* 36 pages

Personalized Adaptation via In-Context Preference Learning

Oct 17, 2024

Allison Lau, Younwoo Choi, Vahid Balazadeh, Keertana Chidambaram, Vasilis Syrgkanis, Rahul G. Krishnan

Abstract: Reinforcement Learning from Human Feedback (RLHF) is widely used to align Language Models (LMs) with human preferences. However, existing approaches often neglect individual user preferences, leading to suboptimal personalization. We present the Preference Pretrained Transformer (PPT), a novel approach for adaptive personalization using online user feedback. PPT leverages the in-context learning capabilities of transformers to dynamically adapt to individual preferences. Our approach consists of two phases: (1) an offline phase where we train a single policy model using a history-dependent loss function, and (2) an online phase where the model adapts to user preferences through in-context learning. We demonstrate PPT's effectiveness in a contextual bandit setting, showing that it achieves personalized adaptation superior to existing methods while significantly reducing the computational costs. Our results suggest the potential of in-context learning for scalable and efficient personalization in large language models.

Implicit Dynamical Flow Fusion (IDFF) for Generative Modeling

Sep 22, 2024

Mohammad R. Rezaei, Rahul G. Krishnan, Milos R. Popovic, Milad Lankarany

Abstract: Conditional Flow Matching (CFM) models can generate high-quality samples from a non-informative prior, but they can be slow, often needing hundreds of network function evaluations (NFEs). To address this, we propose Implicit Dynamical Flow Fusion (IDFF); IDFF learns a new vector field with an additional momentum term that enables taking longer steps during sample generation while maintaining the fidelity of the generated distribution. Consequently, IDFFs reduce the NFEs by a factor of ten (relative to CFMs) without sacrificing sample quality, enabling rapid sampling and efficient handling of image and time-series data generation tasks. We evaluate IDFF on standard benchmarks such as CIFAR-10 and CelebA for image generation. We achieve likelihood and quality performance comparable to CFMs and diffusion-based models with fewer NFEs. IDFF also shows superior performance on time-series modeling tasks, including molecular simulation and sea surface temperature (SST) datasets, highlighting its versatility and effectiveness across different domains.

NeRF-US: Removing Ultrasound Imaging Artifacts from Neural Radiance Fields in the Wild

Aug 21, 2024

Rishit Dagli, Atsuhiro Hibi, Rahul G. Krishnan, Pascal N. Tyrrell

Abstract: Current methods for performing 3D reconstruction and novel view synthesis (NVS) in ultrasound imaging data often face severe artifacts when training NeRF-based approaches. The artifacts produced by current approaches differ from NeRF floaters in general scenes because of the unique nature of ultrasound capture. Furthermore, existing models fail to produce reasonable 3D reconstructions when ultrasound data is captured or obtained casually in uncontrolled environments, which is common in clinical settings. Consequently, existing reconstruction and NVS methods struggle to handle ultrasound motion, fail to capture intricate details, and cannot model transparent and reflective surfaces. In this work, we introduce NeRF-US, which incorporates 3D-geometry guidance for border probability and scattering density into NeRF training, while also utilizing ultrasound-specific rendering over traditional volume rendering. These 3D priors are learned through a diffusion model. Through experiments conducted on our new "Ultrasound in the Wild" dataset, we observe accurate, clinically plausible, artifact-free reconstructions.

Predicting Long-Term Allograft Survival in Liver Transplant Recipients

Aug 10, 2024

Xiang Gao, Michael Cooper, Maryam Naghibzadeh, Amirhossein Azhie, Mamatha Bhat, Rahul G. Krishnan

Abstract: Liver allograft failure occurs in approximately 20% of liver transplant recipients within five years post-transplant, leading to mortality or the need for retransplantation. Providing an accurate and interpretable model for individualized risk estimation of graft failure is essential for improving post-transplant care. To this end, we introduce the Model for Allograft Survival (MAS), a simple linear risk score that outperforms other advanced survival models. Using longitudinal patient follow-up data from the United States (U.S.), we develop our models on 82,959 liver transplant recipients and conduct multi-site evaluations on 11 regions. Additionally, by testing on a separate non-U.S. cohort, we explore the out-of-distribution generalization performance of various models without additional fine-tuning, a crucial property for clinical deployment. We find that the most complex models are also the ones most vulnerable to distribution shifts despite achieving the best in-distribution performance. Our findings not only provide a strong risk score for predicting long-term graft failure but also suggest that the routine machine learning pipeline with only in-distribution held-out validation could create harmful consequences for patients at deployment.

* Accepted at MLHC 2024

End-To-End Causal Effect Estimation from Unstructured Natural Language Data

Jul 09, 2024

Nikita Dhawan, Leonardo Cotta, Karen Ullrich, Rahul G. Krishnan, Chris J. Maddison

Abstract: Knowing the effect of an intervention is critical for human decision-making, but current approaches for causal effect estimation rely on manual data collection and structuring, regardless of the causal assumptions. This increases both the cost and time-to-completion for studies. We show how large, diverse observational text data can be mined with large language models (LLMs) to produce inexpensive causal effect estimates under appropriate causal assumptions. We introduce NATURAL, a novel family of causal effect estimators built with LLMs that operate over datasets of unstructured text. Our estimators use LLM conditional distributions (over variables of interest, given the text data) to assist in the computation of classical estimators of causal effect. We overcome a number of technical challenges to realize this idea, such as automating data curation and using LLMs to impute missing information. We prepare six (two synthetic and four real) observational datasets, paired with corresponding ground truth in the form of randomized trials, which we use to systematically evaluate each step of our pipeline. NATURAL estimators demonstrate remarkable performance, yielding causal effect estimates that fall within 3 percentage points of their ground truth counterparts, including on real-world Phase 3/4 clinical trials. Our results suggest that unstructured text data is a rich source of causal effect information, and NATURAL is a first step towards an automated pipeline to tap this resource.

* 26 pages, 10 figures

InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation

Jun 01, 2024

Jacob Si, Wendy Yusi Cheng, Michael Cooper, Rahul G. Krishnan

Abstract: Tabular data are omnipresent across industry sectors. Neural networks for tabular data such as TabNet have been proposed to make predictions while leveraging the attention mechanism for interpretability. However, the inferred attention masks are often dense, making it challenging to come up with rationales about the predictive signal. To remedy this, we propose InterpreTabNet, a variant of the TabNet model that models the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution. This enables us to regularize the model to learn distinct concepts in the attention masks via a KL divergence regularizer. It prevents overlapping feature selection by promoting sparsity, which maximizes the model's efficacy and improves interpretability when determining the important features for predicting the outcome. To assist in the interpretation of feature interdependencies from our model, we employ a large language model (GPT-4) and use prompt engineering to map from the learned feature mask onto natural language text describing the learned signal. Through comprehensive experiments on real-world datasets, we demonstrate that InterpreTabNet outperforms previous methods for interpreting tabular data while attaining competitive accuracy.
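
As a rough illustration of what sampling from a Gumbel-Softmax distribution involves, the sketch below perturbs a vector of logits with Gumbel noise and applies a temperature-scaled softmax. It is a generic sampler written in plain Python, not InterpreTabNet's implementation; the function name `gumbel_softmax_sample`, the logits, and the temperatures are assumptions made for the example.

```python
import math
import random

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed one-hot sample: softmax((logits + Gumbel noise) / T)."""
    eps = 1e-12
    perturbed = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel = -math.log(-math.log(u))
        perturbed.append((logit + gumbel) / temperature)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(perturbed)
    weights = [math.exp(p - peak) for p in perturbed]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
feature_logits = [1.0, 0.2, -0.5, 0.1]   # hypothetical per-feature scores
soft_mask = gumbel_softmax_sample(feature_logits, temperature=1.0, rng=rng)
sharp_mask = gumbel_softmax_sample(feature_logits, temperature=0.05, rng=rng)
```

Lower temperatures push the sample toward a one-hot vector, which is one way such a relaxation can favour sparse feature masks. How the paper combines this with its KL divergence regularizer is not shown here.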

* ICML 2024

Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity

Apr 10, 2024

Vahid Balazadeh, Keertana Chidambaram, Viet Nguyen, Rahul G. Krishnan, Vasilis Syrgkanis

Abstract: We study the problem of online sequential decision-making given auxiliary demonstrations from experts who made their decisions based on unobserved contextual information. These demonstrations can be viewed as solving related but slightly different tasks than what the learner faces. This setting arises in many application domains, such as self-driving cars, healthcare, and finance, where expert demonstrations are made using contextual information, which is not recorded in the data available to the learning agent. We model the problem as a zero-shot meta-reinforcement learning setting with an unknown task distribution and a Bayesian regret minimization objective, where the unobserved tasks are encoded as parameters with an unknown prior. We propose the Experts-as-Priors algorithm (ExPerior), a non-parametric empirical Bayes approach that utilizes the principle of maximum entropy to establish an informative prior over the learner's decision-making problem. This prior enables the application of any Bayesian approach for online decision-making, such as posterior sampling. We demonstrate that our strategy surpasses existing behaviour cloning and online algorithms for multi-armed bandits and reinforcement learning, showcasing the utility of our approach in leveraging expert demonstrations across different decision-making setups.
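
The posterior sampling step mentioned in the abstract can be sketched for the simplest case, a Bernoulli multi-armed bandit with Beta priors (Thompson sampling). This is not ExPerior: the prior here is a hand-picked set of pseudo-counts rather than a maximum-entropy prior derived from expert demonstrations, and `thompson_bernoulli`, the arm probabilities, and the prior values are hypothetical.

```python
import random

def thompson_bernoulli(true_probs, prior_counts, steps, rng):
    """Thompson sampling with a Beta(alpha, beta) posterior per arm."""
    posterior = [list(pair) for pair in prior_counts]  # [alpha, beta] per arm
    pulls = [0] * len(true_probs)
    for _ in range(steps):
        # Sample a plausible success rate for each arm from its posterior.
        draws = [rng.betavariate(a, b) for a, b in posterior]
        arm = max(range(len(draws)), key=draws.__getitem__)
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update: a success increments alpha, a failure increments beta.
        posterior[arm][0] += reward
        posterior[arm][1] += 1 - reward
        pulls[arm] += 1
    return pulls, posterior

rng = random.Random(0)
arm_probs = [0.2, 0.8]                          # hypothetical environment
informative_prior = [(1.0, 4.0), (4.0, 1.0)]    # made-up pseudo-counts
pulls, posterior = thompson_bernoulli(arm_probs, informative_prior, 500, rng)
```

An informative prior of this kind shifts early exploration toward the arm the prior favours. In the paper's setting, the prior would instead be learned from demonstrations.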

A Geometric Explanation of the Likelihood OOD Detection Paradox

Mar 27, 2024

Hamidreza Kamkari, Brendan Leigh Ross, Jesse C. Cresswell, Anthony L. Caterini, Rahul G. Krishnan, Gabriel Loaiza-Ganem

Abstract: Likelihood-based deep generative models (DGMs) commonly exhibit a puzzling behaviour: when trained on a relatively complex dataset, they assign higher likelihood values to out-of-distribution (OOD) data from simpler sources. Adding to the mystery, OOD samples are never generated by these DGMs despite having higher likelihoods. This two-pronged paradox has yet to be conclusively explained, making likelihood-based OOD detection unreliable. Our primary observation is that high-likelihood regions will not be generated if they contain minimal probability mass. We demonstrate how this seeming contradiction of large densities yet low probability mass can occur around data confined to low-dimensional manifolds. We also show that this scenario can be identified through local intrinsic dimension (LID) estimation, and propose a method for OOD detection which pairs the likelihoods and LID estimates obtained from a pre-trained DGM. Our method can be applied to normalizing flows and score-based diffusion models, and obtains results which match or surpass state-of-the-art OOD detection benchmarks using the same DGM backbones. Our code is available at https://github.com/layer6ai-labs/dgm_ood_detection.
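
The observation that a region can have high density yet negligible probability mass has a familiar analogue in a high-dimensional standard Gaussian, sketched below. This is a swapped-in textbook illustration rather than the paper's manifold and LID analysis; the dimension, sample count, and variable names are assumptions.

```python
import math
import random

def log_density_std_normal(x):
    """Log-density of a standard multivariate normal at point x."""
    d = len(x)
    return -0.5 * sum(v * v for v in x) - 0.5 * d * math.log(2.0 * math.pi)

rng = random.Random(0)
dim = 100
samples = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(1000)]
norms = [math.sqrt(sum(v * v for v in x)) for x in samples]

# The origin is the highest-density point of the distribution...
mode_log_density = log_density_std_normal([0.0] * dim)
best_sample_log_density = max(log_density_std_normal(x) for x in samples)

# ...yet samples land far from it, at a distance of roughly sqrt(dim).
closest_norm = min(norms)
```

Every drawn sample has lower density than the origin, and none falls near it, because the volume around the mode is too small to hold appreciable mass. A likelihood threshold alone would therefore rank the never-sampled origin as the most typical point.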
