Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chandan K. Reddy

Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach

Oct 29, 2023

Nurendra Choudhary, Nikhil Rao, Chandan K. Reddy

Figure 1 for Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach

Figure 2 for Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach

Figure 3 for Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach

Figure 4 for Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach

Abstract:The progress in hyperbolic neural networks (HNNs) research is hindered by their absence of inductive bias mechanisms, which are essential for generalizing to new tasks and facilitating scalable learning over large datasets. In this paper, we aim to alleviate these issues by learning generalizable inductive biases from the nodes' local subgraph and transfer them for faster learning over new subgraphs with a disjoint set of nodes, edges, and labels in a few-shot setting. We introduce a novel method, Hyperbolic GRAph Meta Learner (H-GRAM), that, for the tasks of node classification and link prediction, learns transferable information from a set of support local subgraphs in the form of hyperbolic meta gradients and label hyperbolic protonets to enable faster learning over a query set of new tasks dealing with disjoint subgraphs. Furthermore, we show that an extension of our meta-learning framework also mitigates the scalability challenges seen in HNNs faced by existing approaches. Our comparative analysis shows that H-GRAM effectively learns and transfers information in multiple challenging few-shot settings compared to other state-of-the-art baselines. Additionally, we demonstrate that, unlike standard HNNs, our approach is able to scale over large graph datasets and improve performance over its Euclidean counterparts.

* Accepted to NeurIPS 2023. 14 pages of main paper, 5 pages of supplementary

Via

Access Paper or Ask Questions

SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

Oct 19, 2023

Kazem Meidani, Parshin Shojaee, Chandan K. Reddy, Amir Barati Farimani

Figure 1 for SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

Figure 2 for SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

Figure 3 for SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

Figure 4 for SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

Abstract:In an era where symbolic mathematical equations are indispensable for modeling complex natural phenomena, scientific inquiry often involves collecting observations and translating them into mathematical expressions. Recently, deep learning has emerged as a powerful tool for extracting insights from data. However, existing models typically specialize in either numeric or symbolic domains, and are usually trained in a supervised manner tailored to specific tasks. This approach neglects the substantial benefits that could arise from a task-agnostic unified understanding between symbolic equations and their numeric counterparts. To bridge the gap, we introduce SNIP, a Symbolic-Numeric Integrated Pre-training, which employs joint contrastive learning between symbolic and numeric domains, enhancing their mutual similarities in the pre-trained embeddings. By performing latent space analysis, we observe that SNIP provides cross-domain insights into the representations, revealing that symbolic supervision enhances the embeddings of numeric data and vice versa. We evaluate SNIP across diverse tasks, including symbolic-to-numeric mathematical property prediction and numeric-to-symbolic equation discovery, commonly known as symbolic regression. Results show that SNIP effectively transfers to various tasks, consistently outperforming fully supervised baselines and competing strongly with established task-specific methods, especially in few-shot learning scenarios where available data is limited.

Via

Access Paper or Ask Questions

Towards Semi-Structured Automatic ICD Coding via Tree-based Contrastive Learning

Oct 14, 2023

Chang Lu, Chandan K. Reddy, Ping Wang, Yue Ning

Abstract:Automatic coding of International Classification of Diseases (ICD) is a multi-label text categorization task that involves extracting disease or procedure codes from clinical notes. Despite the application of state-of-the-art natural language processing (NLP) techniques, there are still challenges including limited availability of data due to privacy constraints and the high variability of clinical notes caused by different writing habits of medical professionals and various pathological features of patients. In this work, we investigate the semi-structured nature of clinical notes and propose an automatic algorithm to segment them into sections. To address the variability issues in existing ICD coding models with limited data, we introduce a contrastive pre-training approach on sections using a soft multi-label similarity metric based on tree edit distance. Additionally, we design a masked section training strategy to enable ICD coding models to locate sections related to ICD codes. Extensive experimental results demonstrate that our proposed training strategies effectively enhance the performance of existing ICD coding methods.

* Accepted by NeurIPS 2023

Via

Access Paper or Ask Questions

SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models

May 19, 2023

Akshita Jha, Aida Davani, Chandan K. Reddy, Shachi Dave, Vinodkumar Prabhakaran, Sunipa Dev

Abstract:Stereotype benchmark datasets are crucial to detect and mitigate social stereotypes about groups of people in NLP models. However, existing datasets are limited in size and coverage, and are largely restricted to stereotypes prevalent in the Western society. This is especially problematic as language technologies gain hold across the globe. To address this gap, we present SeeGULL, a broad-coverage stereotype dataset, built by utilizing generative capabilities of large language models such as PaLM, and GPT-3, and leveraging a globally diverse rater pool to validate the prevalence of those stereotypes in society. SeeGULL is in English, and contains stereotypes about identity groups spanning 178 countries across 8 different geo-political regions across 6 continents, as well as state-level identities within the US and India. We also include fine-grained offensiveness scores for different stereotypes and demonstrate their global disparities. Furthermore, we include comparative annotations about the same groups by annotators living in the region vs. those that are based in North America, and demonstrate that within-region stereotypes about groups differ from those prevalent in North America. CONTENT WARNING: This paper contains stereotype examples that may be offensive.

Via

Access Paper or Ask Questions

Complex Logical Reasoning over Knowledge Graphs using Large Language Models

May 02, 2023

Nurendra Choudhary, Chandan K. Reddy

Abstract:Reasoning over knowledge graphs (KGs) is a challenging task that requires a deep understanding of the complex relationships between entities and the underlying logic of their relations. Current approaches rely on learning geometries to embed entities in vector space for logical query operations, but they suffer from subpar performance on complex queries and dataset-specific representations. In this paper, we propose a novel decoupled approach, Language-guided Abstract Reasoning over Knowledge graphs (LARK), that formulates complex KG reasoning as a combination of contextual KG search and abstract logical query reasoning, to leverage the strengths of graph extraction algorithms and large language models (LLM), respectively. Our experiments demonstrate that the proposed approach outperforms state-of-the-art KG reasoning methods on standard benchmark datasets across several logical query constructs, with significant performance gain for queries of higher complexity. Furthermore, we show that the performance of our approach improves proportionally to the increase in size of the underlying LLM, enabling the integration of the latest advancements in LLMs for logical reasoning over KGs. Our work presents a new direction for addressing the challenges of complex KG reasoning and paves the way for future research in this area.

* Code available at https://github.com/Akirato/LLM-KG-Reasoning

Via

Access Paper or Ask Questions

Identifying TBI Physiological States by Clustering of Multivariate Clinical Time-Series

Mar 30, 2023

Hamid Ghaderi, Brandon Foreman, Amin Nayebi, Sindhu Tipirneni, Chandan K. Reddy, Vignesh Subbian

Figure 1 for Identifying TBI Physiological States by Clustering of Multivariate Clinical Time-Series

Figure 2 for Identifying TBI Physiological States by Clustering of Multivariate Clinical Time-Series

Figure 3 for Identifying TBI Physiological States by Clustering of Multivariate Clinical Time-Series

Figure 4 for Identifying TBI Physiological States by Clustering of Multivariate Clinical Time-Series

Abstract:Determining clinically relevant physiological states from multivariate time series data with missing values is essential for providing appropriate treatment for acute conditions such as Traumatic Brain Injury (TBI), respiratory failure, and heart failure. Utilizing non-temporal clustering or data imputation and aggregation techniques may lead to loss of valuable information and biased analyses. In our study, we apply the SLAC-Time algorithm, an innovative self-supervision-based approach that maintains data integrity by avoiding imputation or aggregation, offering a more useful representation of acute patient states. By using SLAC-Time to cluster data in a large research dataset, we identified three distinct TBI physiological states and their specific feature profiles. We employed various clustering evaluation metrics and incorporated input from a clinical domain expert to validate and interpret the identified physiological states. Further, we discovered how specific clinical events and interventions can influence patient states and state transitions.

* 10 pages, 7 figures, 2 tables

Via

Access Paper or Ask Questions

Transformer-based Planning for Symbolic Regression

Mar 16, 2023

Parshin Shojaee, Kazem Meidani, Amir Barati Farimani, Chandan K. Reddy

Figure 1 for Transformer-based Planning for Symbolic Regression

Figure 2 for Transformer-based Planning for Symbolic Regression

Figure 3 for Transformer-based Planning for Symbolic Regression

Figure 4 for Transformer-based Planning for Symbolic Regression

Abstract:Symbolic regression (SR) is a challenging task in machine learning that involves finding a mathematical expression for a function based on its values. Recent advancements in SR have demonstrated the efficacy of pretrained transformer-based models for generating equations as sequences, which benefit from large-scale pretraining on synthetic datasets and offer considerable advantages over GP-based methods in terms of inference time. However, these models focus on supervised pretraining goals borrowed from text generation and ignore equation-specific objectives like accuracy and complexity. To address this, we propose TPSR, a Transformer-based Planning strategy for Symbolic Regression that incorporates Monte Carlo Tree Search into the transformer decoding process. TPSR, as opposed to conventional decoding strategies, allows for the integration of non-differentiable feedback, such as fitting accuracy and complexity, as external sources of knowledge into the equation generation process. Extensive experiments on various datasets show that our approach outperforms state-of-the-art methods, enhancing the model's fitting-complexity trade-off, extrapolation abilities, and robustness to noise. We also demonstrate that the utilization of various caching mechanisms can further enhance the efficiency of TPSR.

* Parshin Shojaee and Kazem Meidani contributed equally to this work

Via

Access Paper or Ask Questions

A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values : An Application to Traumatic Brain Injury Phenotyping

Feb 27, 2023

Hamid Ghaderi, Brandon Foreman, Amin Nayebi, Sindhu Tipirneni, Chandan K. Reddy, Vignesh Subbian

Figure 1 for A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values : An Application to Traumatic Brain Injury Phenotyping

Figure 2 for A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values : An Application to Traumatic Brain Injury Phenotyping

Figure 3 for A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values : An Application to Traumatic Brain Injury Phenotyping

Figure 4 for A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values : An Application to Traumatic Brain Injury Phenotyping

Abstract:Self-supervised learning approaches provide a promising direction for clustering multivariate time-series data. However, real-world time-series data often include missing values, and the existing approaches require imputing missing values before clustering, which may cause extensive computations and noise and result in invalid interpretations. To address these challenges, we present a Self-supervised Learning-based Approach to Clustering multivariate Time-series data with missing values (SLAC-Time). SLAC-Time is a Transformer-based clustering method that uses time-series forecasting as a proxy task for leveraging unlabeled data and learning more robust time-series representations. This method jointly learns the neural network parameters and the cluster assignments of the learned representations. It iteratively clusters the learned representations with the K-means method and then utilizes the subsequent cluster assignments as pseudo-labels to update the model parameters. To evaluate our proposed approach, we applied it to clustering and phenotyping Traumatic Brain Injury (TBI) patients in the TRACK-TBI dataset. Our experiments demonstrate that SLAC-Time outperforms the baseline K-means clustering algorithm in terms of silhouette coefficient, Calinski Harabasz index, Dunn index, and Davies Bouldin index. We identified three TBI phenotypes that are distinct from one another in terms of clinically significant variables as well as clinical outcomes, including the Extended Glasgow Outcome Scale (GOSE) score, Intensive Care Unit (ICU) length of stay, and mortality rate. The experiments show that the TBI phenotypes identified by SLAC-Time can be potentially used for developing targeted clinical trials and therapeutic strategies.

* Submitted to the Journal of Biomedical Informatics

Via

Access Paper or Ask Questions

Execution-based Code Generation using Deep Reinforcement Learning

Feb 13, 2023

Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy

Figure 1 for Execution-based Code Generation using Deep Reinforcement Learning

Figure 2 for Execution-based Code Generation using Deep Reinforcement Learning

Figure 3 for Execution-based Code Generation using Deep Reinforcement Learning

Figure 4 for Execution-based Code Generation using Deep Reinforcement Learning

Abstract:The utilization of programming language (PL) models, pretrained on large-scale code corpora, as a means of automating software engineering processes has demonstrated considerable potential in streamlining various code generation tasks such as code completion, code translation, and program synthesis. However, current approaches mainly rely on supervised fine-tuning objectives borrowed from text generation, neglecting specific sequence-level features of code, including but not limited to compilability as well as syntactic and functional correctness. To address this limitation, we propose PPOCoder, a new framework for code generation that combines pretrained PL models with Proximal Policy Optimization (PPO) deep reinforcement learning and employs execution feedback as the external source of knowledge into the model optimization. PPOCoder is transferable across different code generation tasks and PLs. Extensive experiments on three code generation tasks demonstrate the effectiveness of our proposed approach compared to SOTA methods, improving the success rate of compilation and functional correctness over different PLs. Our code can be found at https://github.com/reddy-lab-code-research/PPOCoder .

Via

Access Paper or Ask Questions

Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis

Feb 07, 2023

Akshita Jha, Adithya Samavedhi, Vineeth Rakesh, Jaideep Chandrashekar, Chandan K. Reddy

Figure 1 for Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis

Figure 2 for Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis

Figure 3 for Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis

Figure 4 for Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis

Abstract:Recent advances in the area of long document matching have primarily focused on using transformer-based models for long document encoding and matching. There are two primary challenges associated with these models. Firstly, the performance gain provided by transformer-based models comes at a steep cost - both in terms of the required training time and the resource (memory and energy) consumption. The second major limitation is their inability to handle more than a pre-defined input token length at a time. In this work, we empirically demonstrate the effectiveness of simple neural models (such as feed-forward networks, and CNNs) and simple embeddings (like GloVe, and Paragraph Vector) over transformer-based models on the task of document matching. We show that simple models outperform the more complex BERT-based models while taking significantly less training time, energy, and memory. The simple models are also more robust to variations in document length and text perturbations.

Via

Access Paper or Ask Questions