Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yunsie Chung

Multitask finetuning and acceleration of chemical pretrained models for small molecule drug property prediction

Oct 14, 2025

Matthew Adrian, Yunsie Chung, Kevin Boyd, Saee Paliwal, Srimukh Prasad Veccham, Alan C. Cheng

Abstract:Chemical pretrained models, sometimes referred to as foundation models, are receiving considerable interest for drug discovery applications. The general chemical knowledge extracted from self-supervised training has the potential to improve predictions for critical drug discovery endpoints, including on-target potency and ADMET properties. Multi-task learning has previously been successfully leveraged to improve predictive models. Here, we show that enabling multitasking in finetuning of chemical pretrained graph neural network models such as Kinetic GROVER Multi-Task (KERMT), an enhanced version of the GROVER model, and Knowledge-guided Pre-training of Graph Transformer (KGPT) significantly improves performance over non-pretrained graph neural network models. Surprisingly, we find that the performance improvement from finetuning KERMT in a multitask manner is most significant at larger data sizes. Additionally, we publish two multitask ADMET data splits to enable more accurate benchmarking of multitask deep learning methods for drug property prediction. Finally, we provide an accelerated implementation of the KERMT model on GitHub, unlocking large-scale pretraining, finetuning, and inference in industrial drug discovery workflows.

Via

Access Paper or Ask Questions

QComp: A QSAR-Based Data Completion Framework for Drug Discovery

May 20, 2024

Bingjia Yang, Yunsie Chung, Archer Y. Yang, Bo Yuan, Xiang Yu

Figure 1 for QComp: A QSAR-Based Data Completion Framework for Drug Discovery

Figure 2 for QComp: A QSAR-Based Data Completion Framework for Drug Discovery

Figure 3 for QComp: A QSAR-Based Data Completion Framework for Drug Discovery

Figure 4 for QComp: A QSAR-Based Data Completion Framework for Drug Discovery

Abstract:In drug discovery, in vitro and in vivo experiments reveal biochemical activities related to the efficacy and toxicity of compounds. The experimental data accumulate into massive, ever-evolving, and sparse datasets. Quantitative Structure-Activity Relationship (QSAR) models, which predict biochemical activities using only the structural information of compounds, face challenges in integrating the evolving experimental data as studies progress. We develop QSAR-Complete (QComp), a data completion framework to address this issue. Based on pre-existing QSAR models, QComp utilizes the correlation inherent in experimental data to enhance prediction accuracy across various tasks. Moreover, QComp emerges as a promising tool for guiding the optimal sequence of experiments by quantifying the reduction in statistical uncertainty for specific endpoints, thereby aiding in rational decision-making throughout the drug discovery process.

Via

Access Paper or Ask Questions