Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rose Yu

Data-Driven Simulator for Mechanical Circulatory Support with Domain Adversarial Neural Process

May 28, 2024

Sophia Sun, Wenyuan Chen, Zihao Zhou, Sonia Fereidooni, Elise Jortberg, Rose Yu

Figure 1 for Data-Driven Simulator for Mechanical Circulatory Support with Domain Adversarial Neural Process

Figure 2 for Data-Driven Simulator for Mechanical Circulatory Support with Domain Adversarial Neural Process

Figure 3 for Data-Driven Simulator for Mechanical Circulatory Support with Domain Adversarial Neural Process

Figure 4 for Data-Driven Simulator for Mechanical Circulatory Support with Domain Adversarial Neural Process

Abstract:Mechanical Circulatory Support (MCS) devices, implemented as a probabilistic deep sequence model. Existing mechanical simulators for MCS rely on oversimplifying assumptions and are insensitive to patient-specific behavior, limiting their applicability to real-world treatment scenarios. To address these shortcomings, our model Domain Adversarial Neural Process (DANP) employs a neural process architecture, allowing it to capture the probabilistic relationship between MCS pump levels and aortic pressure measurements with uncertainty. We use domain adversarial training to combine simulation data with real-world observations, resulting in a more realistic and diverse representation of potential outcomes. Empirical results with an improvement of 19% in non-stationary trend prediction establish DANP as an effective tool for clinicians to understand and make informed decisions regarding MCS patient treatment.

Via

Access Paper or Ask Questions

Symmetry-Informed Governing Equation Discovery

May 27, 2024

Jianke Yang, Wang Rao, Nima Dehmamy, Robin Walters, Rose Yu

Figure 1 for Symmetry-Informed Governing Equation Discovery

Figure 2 for Symmetry-Informed Governing Equation Discovery

Figure 3 for Symmetry-Informed Governing Equation Discovery

Figure 4 for Symmetry-Informed Governing Equation Discovery

Abstract:Despite the advancements in learning governing differential equations from observations of dynamical systems, data-driven methods are often unaware of fundamental physical laws, such as frame invariance. As a result, these algorithms may search an unnecessarily large space and discover equations that are less accurate or overly complex. In this paper, we propose to leverage symmetry in automated equation discovery to compress the equation search space and improve the accuracy and simplicity of the learned equations. Specifically, we derive equivariance constraints from the time-independent symmetries of ODEs. Depending on the types of symmetries, we develop a pipeline for incorporating symmetry constraints into various equation discovery algorithms, including sparse regression and genetic programming. In experiments across a diverse range of dynamical systems, our approach demonstrates better robustness against noise and recovers governing equations with significantly higher probability than baselines without symmetry.

Via

Access Paper or Ask Questions

Understanding the Difficulty of Solving Cauchy Problems with PINNs

May 04, 2024

Tao Wang, Bo Zhao, Sicun Gao, Rose Yu

Figure 1 for Understanding the Difficulty of Solving Cauchy Problems with PINNs

Figure 2 for Understanding the Difficulty of Solving Cauchy Problems with PINNs

Figure 3 for Understanding the Difficulty of Solving Cauchy Problems with PINNs

Figure 4 for Understanding the Difficulty of Solving Cauchy Problems with PINNs

Abstract:Physics-Informed Neural Networks (PINNs) have gained popularity in scientific computing in recent years. However, they often fail to achieve the same level of accuracy as classical methods in solving differential equations. In this paper, we identify two sources of this issue in the case of Cauchy problems: the use of $L^2$ residuals as objective functions and the approximation gap of neural networks. We show that minimizing the sum of $L^2$ residual and initial condition error is not sufficient to guarantee the true solution, as this loss function does not capture the underlying dynamics. Additionally, neural networks are not capable of capturing singularities in the solutions due to the non-compactness of their image sets. This, in turn, influences the existence of global minima and the regularity of the network. We demonstrate that when the global minimum does not exist, machine precision becomes the predominant source of achievable error in practice. We also present numerical experiments in support of our theoretical claims.

* 13 pages and 18 figures

Via

Access Paper or Ask Questions

On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers

Apr 04, 2024

Cai Zhou, Rose Yu, Yusu Wang

Figure 1 for On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers

Figure 2 for On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers

Figure 3 for On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers

Figure 4 for On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers

Abstract:Graph transformers have recently received significant attention in graph learning, partly due to their ability to capture more global interaction via self-attention. Nevertheless, while higher-order graph neural networks have been reasonably well studied, the exploration of extending graph transformers to higher-order variants is just starting. Both theoretical understanding and empirical results are limited. In this paper, we provide a systematic study of the theoretical expressive power of order-$k$ graph transformers and sparse variants. We first show that, an order-$k$ graph transformer without additional structural information is less expressive than the $k$-Weisfeiler Lehman ($k$-WL) test despite its high computational cost. We then explore strategies to both sparsify and enhance the higher-order graph transformers, aiming to improve both their efficiency and expressiveness. Indeed, sparsification based on neighborhood information can enhance the expressive power, as it provides additional information about input graph structures. In particular, we show that a natural neighborhood-based sparse order-$k$ transformer model is not only computationally efficient, but also expressive -- as expressive as $k$-WL test. We further study several other sparse graph attention models that are computationally efficient and provide their expressiveness analysis. Finally, we provide experimental results to show the effectiveness of the different sparsification strategies.

* Accepted to AISTATS 2024. 40 pages

Via

Access Paper or Ask Questions

Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling

Feb 29, 2024

Ruijia Niu, Dongxia Wu, Kai Kim, Yi-An Ma, Duncan Watson-Parris, Rose Yu

Figure 1 for Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling

Figure 2 for Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling

Figure 3 for Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling

Figure 4 for Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling

Abstract:Multi-fidelity surrogate modeling aims to learn an accurate surrogate at the highest fidelity level by combining data from multiple sources. Traditional methods relying on Gaussian processes can hardly scale to high-dimensional data. Deep learning approaches utilize neural network based encoders and decoders to improve scalability. These approaches share encoded representations across fidelities without including corresponding decoder parameters. At the highest fidelity, the representations are decoded with different parameters, making the shared information inherently inaccurate. This hinders inference performance, especially in out-of-distribution scenarios when the highest fidelity data has limited domain coverage. To address these limitations, we propose Multi-fidelity Residual Neural Processes (MFRNP), a novel multi-fidelity surrogate modeling framework. MFRNP optimizes lower fidelity decoders for accurate information sharing by aggregating lower fidelity surrogate outputs and models residual between the aggregation and ground truth on the highest fidelity. We show that MFRNP significantly outperforms current state-of-the-art in learning partial differential equations and a real-world climate modeling task.

* A novel probabilistic inference approach for scalable multi-fidelity surrogate modeling

Via

Access Paper or Ask Questions

MORL-Prompt: An Empirical Analysis of Multi-Objective Reinforcement Learning for Discrete Prompt Optimization

Feb 18, 2024

Yasaman Jafari, Dheeraj Mekala, Rose Yu, Taylor Berg-Kirkpatrick

Abstract:RL-based techniques can be used to search for prompts that when fed into a target language model maximize a set of user-specified reward functions. However, in many target applications, the natural reward functions are in tension with one another -- for example, content preservation vs. style matching in style transfer tasks. Current techniques focus on maximizing the average of reward functions, which does not necessarily lead to prompts that achieve balance across rewards -- an issue that has been well-studied in the multi-objective and robust optimization literature. In this paper, we adapt several techniques for multi-objective optimization to RL-based discrete prompt optimization -- two that consider volume of the Pareto reward surface, and another that chooses an update direction that benefits all rewards simultaneously. We conduct an empirical analysis of these methods on two NLP tasks: style transfer and machine translation, each using three competing reward functions. Our experiments demonstrate that multi-objective methods that directly optimize volume perform better and achieve a better balance of all rewards than those that attempt to find monotonic update directions.

Via

Access Paper or Ask Questions

MFBind: a Multi-Fidelity Approach for Evaluating Drug Compounds in Practical Generative Modeling

Feb 16, 2024

Peter Eckmann, Dongxia Wu, Germano Heinzelmann, Michael K Gilson, Rose Yu

Abstract:Current generative models for drug discovery primarily use molecular docking to evaluate the quality of generated compounds. However, such models are often not useful in practice because even compounds with high docking scores do not consistently show experimental activity. More accurate methods for activity prediction exist, such as molecular dynamics based binding free energy calculations, but they are too computationally expensive to use in a generative model. We propose a multi-fidelity approach, Multi-Fidelity Bind (MFBind), to achieve the optimal trade-off between accuracy and computational cost. MFBind integrates docking and binding free energy simulators to train a multi-fidelity deep surrogate model with active learning. Our deep surrogate model utilizes a pretraining technique and linear prediction heads to efficiently fit small amounts of high-fidelity data. We perform extensive experiments and show that MFBind (1) outperforms other state-of-the-art single and multi-fidelity baselines in surrogate modeling, and (2) boosts the performance of generative models with markedly higher quality compounds.

* 9 pages, 4 figures

Via

Access Paper or Ask Questions

Learning Granger Causality from Instance-wise Self-attentive Hawkes Processes

Feb 06, 2024

Dongxia Wu, Tsuyoshi Idé, Aurélie Lozano, Georgios Kollias, Jiří Navrátil, Naoki Abe, Yi-An Ma, Rose Yu

Abstract:We address the problem of learning Granger causality from asynchronous, interdependent, multi-type event sequences. In particular, we are interested in discovering instance-level causal structures in an unsupervised manner. Instance-level causality identifies causal relationships among individual events, providing more fine-grained information for decision-making. Existing work in the literature either requires strong assumptions, such as linearity in the intensity function, or heuristically defined model parameters that do not necessarily meet the requirements of Granger causality. We propose Instance-wise Self-Attentive Hawkes Processes (ISAHP), a novel deep learning framework that can directly infer the Granger causality at the event instance level. ISAHP is the first neural point process model that meets the requirements of Granger causality. It leverages the self-attention mechanism of the transformer to align with the principles of Granger causality. We empirically demonstrate that ISAHP is capable of discovering complex instance-level causal structures that cannot be handled by classical models. We also show that ISAHP achieves state-of-the-art performance in proxy tasks involving type-level causal discovery and instance-level event type prediction.

Via

Access Paper or Ask Questions

Target-Free Compound Activity Prediction via Few-Shot Learning

Nov 27, 2023

Peter Eckmann, Jake Anderson, Michael K. Gilson, Rose Yu

Abstract:Predicting the activities of compounds against protein-based or phenotypic assays using only a few known compounds and their activities is a common task in target-free drug discovery. Existing few-shot learning approaches are limited to predicting binary labels (active/inactive). However, in real-world drug discovery, degrees of compound activity are highly relevant. We study Few-Shot Compound Activity Prediction (FS-CAP) and design a novel neural architecture to meta-learn continuous compound activities across large bioactivity datasets. Our model aggregates encodings generated from the known compounds and their activities to capture assay information. We also introduce a separate encoder for the unknown compound. We show that FS-CAP surpasses traditional similarity-based techniques as well as other state of the art few-shot learning methods on a variety of target-free drug discovery settings and datasets.

* 9 pages, 2 figures

Via

Access Paper or Ask Questions

Discovering Mixtures of Structural Causal Models from Time Series Data

Oct 10, 2023

Sumanth Varambally, Yi-An Ma, Rose Yu

Figure 1 for Discovering Mixtures of Structural Causal Models from Time Series Data

Figure 2 for Discovering Mixtures of Structural Causal Models from Time Series Data

Figure 3 for Discovering Mixtures of Structural Causal Models from Time Series Data

Figure 4 for Discovering Mixtures of Structural Causal Models from Time Series Data

Abstract:In fields such as finance, climate science, and neuroscience, inferring causal relationships from time series data poses a formidable challenge. While contemporary techniques can handle nonlinear relationships between variables and flexible noise distributions, they rely on the simplifying assumption that data originates from the same underlying causal model. In this work, we relax this assumption and perform causal discovery from time series data originating from mixtures of different causal models. We infer both the underlying structural causal models and the posterior probability for each sample belonging to a specific mixture component. Our approach employs an end-to-end training process that maximizes an evidence-lower bound for data likelihood. Through extensive experimentation on both synthetic and real-world datasets, we demonstrate that our method surpasses state-of-the-art benchmarks in causal discovery tasks, particularly when the data emanates from diverse underlying causal graphs. Theoretically, we prove the identifiability of such a model under some mild assumptions.

Via

Access Paper or Ask Questions