Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xuan Lin

Softmax Linear Attention: Reclaiming Global Competition

Feb 02, 2026

Mingwei Xu, Xuan Lin, Xinnan Guo, Wanqing Xu, Wanyun Cui

Abstract:While linear attention reduces the quadratic complexity of standard Transformers to linear time, it often lags behind in expressivity due to the removal of softmax normalization. This omission eliminates \emph{global competition}, a critical mechanism that enables models to sharply focus on relevant information amidst long-context noise. In this work, we propose \textbf{Softmax Linear Attention (SLA)}, a framework designed to restore this competitive selection without sacrificing efficiency. By lifting the softmax operation from the token level to the head level, SLA leverages attention heads as coarse semantic slots, applying a competitive gating mechanism to dynamically select the most relevant subspaces. This reintroduces the ``winner-take-all'' dynamics essential for precise retrieval and robust long-context understanding. Distinct from prior methods that focus on refining local kernel functions, SLA adopts a broader perspective by exploiting the higher-level multi-head aggregation structure. Extensive experiments demonstrate that SLA consistently enhances state-of-the-art linear baselines (RetNet, GLA, GDN) across language modeling and long-context benchmarks, particularly in challenging retrieval scenarios where it significantly boosts robustness against noise, validating its capability to restore precise focus while maintaining linear complexity.

* 11 pages,4 figures

Via

Access Paper or Ask Questions

MolBridge: Atom-Level Joint Graph Refinement for Robust Drug-Drug Interaction Event Prediction

Oct 23, 2025

Xuan Lin, Aocheng Ding, Tengfei Ma, Hua Liang, Zhe Quan

Abstract:Drug combinations offer therapeutic benefits but also carry the risk of adverse drug-drug interactions (DDIs), especially under complex molecular structures. Accurate DDI event prediction requires capturing fine-grained inter-drug relationships, which are critical for modeling metabolic mechanisms such as enzyme-mediated competition. However, existing approaches typically rely on isolated drug representations and fail to explicitly model atom-level cross-molecular interactions, limiting their effectiveness across diverse molecular complexities and DDI type distributions. To address these limitations, we propose MolBridge, a novel atom-level joint graph refinement framework for robust DDI event prediction. MolBridge constructs a joint graph that integrates atomic structures of drug pairs, enabling direct modeling of inter-drug associations. A central challenge in such joint graph settings is the potential loss of information caused by over-smoothing when modeling long-range atomic dependencies. To overcome this, we introduce a structure consistency module that iteratively refines node features while preserving the global structural context. This joint design allows MolBridge to effectively learn both local and global interaction outperforms state-of-the-art baselines, achieving superior performance across long-tail and inductive scenarios. patterns, yielding robust representations across both frequent and rare DDI types. Extensive experiments on two benchmark datasets show that MolBridge consistently. These results demonstrate the advantages of fine-grained graph refinement in improving the accuracy, robustness, and mechanistic interpretability of DDI event prediction.This work contributes to Web Mining and Content Analysis by developing graph-based methods for mining and analyzing drug-drug interaction networks.

Via

Access Paper or Ask Questions

DynamicDTA: Drug-Target Binding Affinity Prediction Using Dynamic Descriptors and Graph Representation

May 13, 2025

Dan Luo, Jinyu Zhou, Le Xu, Sisi Yuan, Xuan Lin

Abstract:Predicting drug-target binding affinity (DTA) is essential for identifying potential therapeutic candidates in drug discovery. However, most existing models rely heavily on static protein structures, often overlooking the dynamic nature of proteins, which is crucial for capturing conformational flexibility that will be beneficial for protein binding interactions. We introduce DynamicDTA, an innovative deep learning framework that incorporates static and dynamic protein features to enhance DTA prediction. The proposed DynamicDTA takes three types of inputs, including drug sequence, protein sequence, and dynamic descriptors. A molecular graph representation of the drug sequence is generated and subsequently processed through graph convolutional network, while the protein sequence is encoded using dilated convolutions. Dynamic descriptors, such as root mean square fluctuation, are processed through a multi-layer perceptron. These embedding features are fused with static protein features using cross-attention, and a tensor fusion network integrates all three modalities for DTA prediction. Extensive experiments on three datasets demonstrate that DynamicDTA achieves by at least 3.4% improvement in RMSE score with comparison to seven state-of-the-art baseline methods. Additionally, predicting novel drugs for Human Immunodeficiency Virus Type 1 and visualizing the docking complexes further demonstrates the reliability and biological relevance of DynamicDTA.

* Accepted for publication at Interdisciplinary Sciences: Computational Life Sciences, 2025

Via

Access Paper or Ask Questions

Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning

May 05, 2025

Xuan Lin, Qingrui Liu, Hongxin Xiang, Daojian Zeng, Xiangxiang Zeng

Figure 1 for Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning

Figure 2 for Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning

Figure 3 for Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning

Figure 4 for Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning

Abstract:Chemical reaction and retrosynthesis prediction are fundamental tasks in drug discovery. Recently, large language models (LLMs) have shown potential in many domains. However, directly applying LLMs to these tasks faces two major challenges: (i) lacking a large-scale chemical synthesis-related instruction dataset; (ii) ignoring the close correlation between reaction and retrosynthesis prediction for the existing fine-tuning strategies. To address these challenges, we propose ChemDual, a novel LLM framework for accurate chemical synthesis. Specifically, considering the high cost of data acquisition for reaction and retrosynthesis, ChemDual regards the reaction-and-retrosynthesis of molecules as a related recombination-and-fragmentation process and constructs a large-scale of 4.4 million instruction dataset. Furthermore, ChemDual introduces an enhanced LLaMA, equipped with a multi-scale tokenizer and dual-task learning strategy, to jointly optimize the process of recombination and fragmentation as well as the tasks between reaction and retrosynthesis prediction. Extensive experiments on Mol-Instruction and USPTO-50K datasets demonstrate that ChemDual achieves state-of-the-art performance in both predictions of reaction and retrosynthesis, outperforming the existing conventional single-task approaches and the general open-source LLMs. Through molecular docking analysis, ChemDual generates compounds with diverse and strong protein binding affinity, further highlighting its strong potential in drug design.

* Accepted for publication at IJCAI 2025

Via

Access Paper or Ask Questions

Property Enhanced Instruction Tuning for Multi-task Molecule Generation with Large Language Models

Dec 24, 2024

Xuan Lin, Long Chen, Yile Wang, Xiangxiang Zeng, Philip S. Yu

Abstract:Large language models (LLMs) are widely applied in various natural language processing tasks such as question answering and machine translation. However, due to the lack of labeled data and the difficulty of manual annotation for biochemical properties, the performance for molecule generation tasks is still limited, especially for tasks involving multi-properties constraints. In this work, we present a two-step framework PEIT (Property Enhanced Instruction Tuning) to improve LLMs for molecular-related tasks. In the first step, we use textual descriptions, SMILES, and biochemical properties as multimodal inputs to pre-train a model called PEIT-GEN, by aligning multi-modal representations to synthesize instruction data. In the second step, we fine-tune existing open-source LLMs with the synthesized data, the resulting PEIT-LLM can handle molecule captioning, text-based molecule generation, molecular property prediction, and our newly proposed multi-constraint molecule generation tasks. Experimental results show that our pre-trained PEIT-GEN outperforms MolT5 and BioT5 in molecule captioning, demonstrating modalities align well between textual descriptions, structures, and biochemical properties. Furthermore, PEIT-LLM shows promising improvements in multi-task molecule generation, proving the scalability of the PEIT framework for various molecular tasks. We release the code, constructed instruction data, and model checkpoints in https://github.com/chenlong164/PEIT.

Via

Access Paper or Ask Questions

S$^2$DN: Learning to Denoise Unconvincing Knowledge for Inductive Knowledge Graph Completion

Dec 20, 2024

Tengfei Ma, Yujie Chen, Liang Wang, Xuan Lin, Bosheng Song, Xiangxiang Zeng

Abstract:Inductive Knowledge Graph Completion (KGC) aims to infer missing facts between newly emerged entities within knowledge graphs (KGs), posing a significant challenge. While recent studies have shown promising results in inferring such entities through knowledge subgraph reasoning, they suffer from (i) the semantic inconsistencies of similar relations, and (ii) noisy interactions inherent in KGs due to the presence of unconvincing knowledge for emerging entities. To address these challenges, we propose a Semantic Structure-aware Denoising Network (S$^2$DN) for inductive KGC. Our goal is to learn adaptable general semantics and reliable structures to distill consistent semantic knowledge while preserving reliable interactions within KGs. Specifically, we introduce a semantic smoothing module over the enclosing subgraphs to retain the universal semantic knowledge of relations. We incorporate a structure refining module to filter out unreliable interactions and offer additional knowledge, retaining robust structure surrounding target links. Extensive experiments conducted on three benchmark KGs demonstrate that S$^2$DN surpasses the performance of state-of-the-art models. These results demonstrate the effectiveness of S$^2$DN in preserving semantic consistency and enhancing the robustness of filtering out unreliable interactions in contaminated KGs.

* 15 pages

Via

Access Paper or Ask Questions

Y-Mol: A Multiscale Biomedical Knowledge-Guided Large Language Model for Drug Development

Oct 15, 2024

Tengfei Ma, Xuan Lin, Tianle Li, Chaoyi Li, Long Chen, Peng Zhou, Xibao Cai, Xinyu Yang, Daojian Zeng, Dongsheng Cao(+1 more)

Abstract:Large Language Models (LLMs) have recently demonstrated remarkable performance in general tasks across various fields. However, their effectiveness within specific domains such as drug development remains challenges. To solve these challenges, we introduce \textbf{Y-Mol}, forming a well-established LLM paradigm for the flow of drug development. Y-Mol is a multiscale biomedical knowledge-guided LLM designed to accomplish tasks across lead compound discovery, pre-clinic, and clinic prediction. By integrating millions of multiscale biomedical knowledge and using LLaMA2 as the base LLM, Y-Mol augments the reasoning capability in the biomedical domain by learning from a corpus of publications, knowledge graphs, and expert-designed synthetic data. The capability is further enriched with three types of drug-oriented instructions: description-based prompts from processed publications, semantic-based prompts for extracting associations from knowledge graphs, and template-based prompts for understanding expert knowledge from biomedical tools. Besides, Y-Mol offers a set of LLM paradigms that can autonomously execute the downstream tasks across the entire process of drug development, including virtual screening, drug design, pharmacological properties prediction, and drug-related interaction prediction. Our extensive evaluations of various biomedical sources demonstrate that Y-Mol significantly outperforms general-purpose LLMs in discovering lead compounds, predicting molecular properties, and identifying drug interaction events.

* 12 pages, Under Review

Via

Access Paper or Ask Questions

Optimization-based Task and Motion Planning under Signal Temporal Logic Specifications using Logic Network Flow

Sep 27, 2024

Xuan Lin, Jiming Ren, Samuel Coogan, Ye Zhao

Abstract:This paper proposes an optimization-based task and motion planning framework, named ``Logic Network Flow", to integrate signal temporal logic (STL) specifications into efficient mixed-binary linear programmings. In this framework, temporal predicates are encoded as polyhedron constraints on each edge of the network flow, instead of as constraints between the nodes as in the traditional Logic Tree formulation. Synthesized with Dynamic Network Flows, Logic Network Flows render a tighter convex relaxation compared to Logic Trees derived from these STL specifications. Our formulation is evaluated on several multi-robot motion planning case studies. Empirical results demonstrate that our formulation outperforms Logic Tree formulation in terms of computation time for several planning problems. As the problem size scales up, our method still discovers better lower and upper bounds by exploring fewer number of nodes during the branch-and-bound process, although this comes at the cost of increased computational load for each node when exploring branches.

Via

Access Paper or Ask Questions

Evaluating Data-driven Performances of Mixed Integer Bilinear Formulations for Book Placement Planning

Jun 07, 2024

Xuan Lin, Gabriel Ikaika Fernandez, Dennis Hong

Abstract:Mixed integer bilinear programs (MIBLPs) offer tools to resolve robotics motion planning problems with orthogonal rotation matrices or static moment balance, but require long solving times. Recent work utilizing data-driven methods has shown potential to overcome this issue allowing for applications on larger scale problems. To solve mixed-integer bilinear programs online with data-driven methods, several re-formulations exist including mathematical programming with complementary constraints (MPCC), and mixed-integer programming (MIP). In this work, we compare the data-driven performances of various MIBLP reformulations using a book placement problem that has discrete configuration switches and bilinear constraints. The success rate, cost, and solving time are compared along with non-data-driven methods. Our results demonstrate the advantage of using data-driven methods to accelerate the solving speed of MIBLPs, and provide references for users to choose the suitable re-formulation.

* UR 2024 accepted. arXiv admin note: substantial text overlap with arXiv:2208.13158

Via

Access Paper or Ask Questions

Accelerate Hybrid Model Predictive Control using Generalized Benders Decomposition

Jun 02, 2024

Xuan Lin

Abstract:Hybrid model predictive control with both continuous and discrete variables is widely applicable to robotics tasks. Due to the combinatorial complexity, the solving speed of hybrid MPC can be insufficient for real-time applications. In this paper, we propose to accelerate hybrid MPC using Generalized Benders Decomposition (GBD). GBD enumerates cuts online and stores inside a finite buffer to provide warm-starts for the new problem instances. Leveraging on the sparsity of feasibility cuts, a fast algorithm is designed for Benders master problems. We also propose to construct initial optimality cuts from heuristic solutions allowing GBD to plan for longer time horizons. The proposed algorithm successfully controls a cart-pole system with randomly moving soft-contact walls reaching speeds 2-3 times faster than Gurobi, oftentimes exceeding 1000Hz. It also guides a free-flying robot through a maze with a time horizon of 50 re-planning at 20Hz. The code is available at https://github.com/XuanLin/Benders-MPC.

* arXiv admin note: substantial text overlap with arXiv:2401.00917

Via

Access Paper or Ask Questions