Sam Money-Kyrle

On Improving Graph Neural Networks for QSAR by Pre-training on Extended-Connectivity Fingerprints

May 11, 2026

Sam Money-Kyrle, Markus Dablander, Thierry Hanser, Stephane Werner, Charlotte M. Deane, Garrett M. Morris

Abstract: Molecular Graph Neural Networks (GNNs) are increasingly common in drug discovery, particularly for Quantitative Structure-Activity Relationship (QSAR) studies; yet, their superiority compared to classical molecular featurisation approaches is disputed. We report a general strategy for improving GNNs for QSAR by pre-training to predict Extended-Connectivity Fingerprints (ECFP). We validate our approach with statistical tests and challenging out-of-distribution (OOD) splits. Across five out of six Biogen benchmarks, we observed a statistically significant improvement in standard performance metrics over all evaluated baselines when using ECFP pre-trained GNNs. However, for more heterogeneous datasets and more complex endpoints, such as binding affinity prediction, pre-trained GNNs underperformed in OOD settings. Importantly, we investigated the impact of substructure-level data leakage during pre-training on downstream performance. While we identified scenarios where pre-training on ECFPs was less effective, our findings show that ECFP-based pre-training can enhance downstream OOD performance on a diverse set of practically relevant QSAR tasks.

Predicting protein stability changes under multiple amino acid substitutions using equivariant graph neural networks

May 30, 2023

Sebastien Boyer, Sam Money-Kyrle, Oliver Bent

Abstract: The accurate prediction of changes in protein stability under multiple amino acid substitutions is essential for realising true in-silico protein re-design. To this purpose, we propose improvements to state-of-the-art deep learning (DL) protein stability prediction models, enabling first-of-a-kind predictions for variable numbers of amino acid substitutions, on structural representations, by decoupling the atomic and residue scales of protein representations. This was achieved using E(3)-equivariant graph neural networks (EGNNs) for both atomic environment (AE) embedding and residue-level scoring tasks. Our AE embedder was used to featurise a residue-level graph, then trained to score mutant stability ($\Delta\Delta G$). To achieve effective training of this predictive EGNN we have leveraged the unprecedented scale of a new high-throughput protein stability experimental data-set, Mega-scale. Finally, we demonstrate the immediately promising results of this procedure, discuss the current shortcomings, and highlight potential future strategies.
