Shiyun Wa

Advancing Ligand-based Virtual Screening and Molecular Generation with Pretrained Molecular Embedding Distance

Apr 27, 2026

Shiyun Wa, Yifei Wang, Simone Sciabola, Ye Wang

Abstract: Molecular similarity plays a central role in ligand-based drug discovery, such as virtual screening, analog searching, and goal-directed molecular generation. However, traditional similarity measures, ranging from fingerprint-based Tanimoto coefficients to 3D shape overlays, are often computationally expensive at scale or rely on hand-crafted molecular descriptors. Meanwhile, many deep learning approaches to similarity-aware design still depend on similarity-specific supervision or costly data curation, limiting their generality across targets. In this work, we propose pretrained embedding distance (PED) as an effective alternative, computed directly from pretrained molecular models without task-specific training. Experimental results show that PED exhibits distinct correlations with traditional similarity metrics, and performs effectively in both ranking molecules for virtual screening and guiding molecular generation via reward design. These findings suggest that pretrained molecular embeddings capture rich structural information and can serve as a promising and scalable similarity measurement for modern AI-aided drug discovery.

* 26 pages, 12 figures, 9 tables
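
The idea in the abstract above, ranking candidate molecules by their distance to a query in a pretrained embedding space, can be illustrated with a minimal sketch. The abstract does not say which distance function or which pretrained model PED uses, so cosine distance, the toy two-dimensional vectors, and the names `cosine_distance`, `tanimoto`, and `rank_by_distance` are all assumptions made for illustration. In practice the vectors would come from a pretrained molecular encoder. A set-based Tanimoto coefficient is included only as the traditional baseline the abstract mentions.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two embedding vectors.

    Assumption: cosine distance stands in for the paper's PED, whose exact
    definition is not given in the abstract.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Convention chosen here: two empty fingerprints score 0.0.
    return len(a & b) / union if union else 0.0


def rank_by_distance(query_embedding, library):
    """Return library molecule names sorted from nearest to farthest."""
    return sorted(
        library,
        key=lambda name: cosine_distance(query_embedding, library[name]),
    )


# Hypothetical toy embeddings; real ones would be produced by a pretrained model.
query = [1.0, 0.0]
library = {
    "mol_a": [0.9, 0.1],
    "mol_b": [0.0, 1.0],
    "mol_c": [-1.0, 0.0],
}
ranking = rank_by_distance(query, library)
```

Because no similarity-specific training is involved, screening a new query only requires embedding it once and sorting the library by distance, which is the scalability argument the abstract makes.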

Stable and Interpretable Deep Learning for Tabular Data: Introducing InterpreTabNet with the Novel InterpreStability Metric

Oct 04, 2023

Shiyun Wa, Xinai Lu, Minjuan Wang

Abstract: As Artificial Intelligence (AI) integrates deeper into diverse sectors, the quest for powerful models has intensified. While significant strides have been made in boosting model capabilities and their applicability across domains, a glaring challenge persists: many of these state-of-the-art models remain black boxes. This opacity not only complicates the explanation of model decisions to end-users but also obstructs insights into intermediate processes for model designers. To address these challenges, we introduce InterpreTabNet, a model designed to enhance both classification accuracy and interpretability by leveraging the TabNet architecture with an improved attentive module. This design ensures robust gradient propagation and computational stability. Additionally, we present a novel evaluation metric, InterpreStability, which quantifies the stability of a model's interpretability. The proposed model and metric mark a significant stride forward in explainable-model research, setting a standard for transparency and interpretability in AI model design and application across diverse sectors. InterpreTabNet surpasses other leading solutions in tabular data analysis across varied application scenarios, paving the way for further research into creating deep-learning models that are both highly accurate and inherently explainable. The introduction of the InterpreStability metric ensures that the interpretability of future models can be measured and compared in a consistent and rigorous manner. Collectively, these contributions have the potential to promote the design principles and development of next-generation interpretable AI models, widening the adoption of interpretable AI solutions in critical decision-making environments.

* 34 pages, 7 figures, 8 tables
