Jingzhou He

GEM-2: Next Generation Molecular Property Prediction Network with Many-body and Full-range Interaction Modeling

Aug 15, 2022
Lihang Liu, Donglong He, Xiaomin Fang, Shanzhuo Zhang, Fan Wang, Jingzhou He, Hua Wu

Molecular property prediction is a fundamental task in the drug and materials industries. Physically, the properties of a molecule are determined by its electronic structure, which can be exactly described by the Schrödinger equation. However, solving the Schrödinger equation for most molecules is extremely challenging because of the long-range interactions in a quantum many-body system. Deep learning methods have proven effective in molecular property prediction, and we design a novel method, GEM-2, that comprehensively considers both the long-range and many-body interactions in molecules. GEM-2 consists of two interacting tracks: an atom-level track that models both the local and global correlations between any two atoms, and a pair-level track that models the correlations between all atom pairs, which embeds information among any 3 or 4 atoms. Extensive experiments demonstrate the superiority of GEM-2 over multiple baseline methods on quantum chemistry and drug discovery tasks.

Via arXiv

HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative

Aug 09, 2022
Xiaomin Fang, Fan Wang, Lihang Liu, Jingzhou He, Dayong Lin, Yingfei Xiang, Xiaonan Zhang, Hua Wu, Hui Li, Le Song

AI-based protein structure prediction pipelines, such as AlphaFold2, have achieved near-experimental accuracy. These pipelines rely mainly on Multiple Sequence Alignments (MSAs) as inputs to learn co-evolution information from homologous sequences. However, searching protein databases for MSAs is time-consuming, usually taking dozens of minutes. We therefore explore the limits of fast protein structure prediction using only the primary sequences of proteins. We propose HelixFold-Single, which combines a large-scale protein language model with the superior geometric learning capability of AlphaFold2. HelixFold-Single first pre-trains a large-scale protein language model (PLM) on thousands of millions of primary sequences using the self-supervised learning paradigm; the PLM serves as an alternative to MSAs for learning co-evolution information. Then, by combining the pre-trained PLM with the essential components of AlphaFold2, we obtain an end-to-end differentiable model that predicts the 3D coordinates of atoms from the primary sequence alone. HelixFold-Single is validated on the CASP14 and CAMEO datasets, achieving accuracy competitive with MSA-based methods on targets with large homologous families. Furthermore, HelixFold-Single consumes much less time than the mainstream pipelines for protein structure prediction, demonstrating its potential in tasks requiring many predictions. The code of HelixFold-Single is available at https://github.com/PaddlePaddle/PaddleHelix/tree/dev/apps/protein_folding/helixfold-single, and we also provide stable web services at https://paddlehelix.baidu.com/app/drug/protein-single/forecast.

Via arXiv