Haoyi Xiong

Structure-aware Interactive Graph Neural Networks for the Prediction of Protein-Ligand Binding Affinity

Jul 21, 2021

Shuangli Li, Jingbo Zhou, Tong Xu, Liang Huang, Fan Wang, Haoyi Xiong, Weili Huang, Dejing Dou, Hui Xiong

Abstract: Drug discovery often relies on the successful prediction of protein-ligand binding affinity. Recent advances have shown great promise in applying graph neural networks (GNNs) for better affinity prediction by learning the representations of protein-ligand complexes. However, existing solutions usually treat protein-ligand complexes as topological graph data, so the biomolecular structural information is not fully utilized. The essential long-range interactions among atoms are also neglected in GNN models. To this end, we propose a structure-aware interactive graph neural network (SIGN), which consists of two components: polar-inspired graph attention layers (PGAL) and pairwise interactive pooling (PiPool). Specifically, PGAL iteratively performs the node-edge aggregation process to update embeddings of nodes and edges while preserving the distance and angle information among atoms. Then, PiPool is adopted to gather interactive edges with a subsequent reconstruction loss to reflect the global interactions. An exhaustive experimental study on two benchmarks verifies the superiority of SIGN.

* 11 pages, 8 figures, Accepted by KDD 2021 (Research Track)
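
The abstract says PGAL preserves distance and angle information among atoms. As a minimal sketch of what those two geometric quantities are (not the SIGN implementation; the function names and the toy coordinates below are hypothetical), they can be computed from 3D atom positions like this:

```python
import math

def distance(a, b):
    """Euclidean distance between two atoms given as (x, y, z) tuples."""
    return math.dist(a, b)

def angle_at(center, a, b):
    """Angle in radians at `center` between the directions to atoms `a` and `b`."""
    v1 = [p - c for p, c in zip(a, center)]
    v2 = [p - c for p, c in zip(b, center)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# Toy coordinates (assumed for illustration only): three atoms forming a right angle.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
d01 = distance(atoms[0], atoms[1])
theta = angle_at(atoms[1], atoms[0], atoms[2])
```

A structure-aware model could attach values like `d01` and `theta` to edges or edge pairs, which a purely topological graph does not carry.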

Face.evoLVe: A High-Performance Face Recognition Library

Jul 20, 2021

Qingzhong Wang, Pengfei Zhang, Haoyi Xiong, Jian Zhao

Abstract: In this paper, we develop face.evoLVe -- a comprehensive library that collects and implements a wide range of popular deep learning-based methods for face recognition. First of all, face.evoLVe is composed of key components that cover the full process of face analytics, including face alignment, data processing, various backbones, losses, and alternatives with bags of tricks for improving performance. In addition, face.evoLVe supports multi-GPU training on top of different deep learning platforms, such as PyTorch and PaddlePaddle, which helps researchers work on both large-scale datasets with millions of images and low-shot counterparts with limited well-annotated data. More importantly, along with face.evoLVe, images before and after alignment in the common benchmark datasets are released, with source code and trained models provided. All these efforts lower the technical burden of reproducing existing methods for comparison, so that users of our library can focus on developing advanced approaches more efficiently. Last but not least, face.evoLVe is well designed and vibrantly evolving, so that new face recognition approaches can be easily plugged into our framework. Note that we have used face.evoLVe to participate in a number of face recognition competitions and secured first place. The version that supports PyTorch is publicly available at https://github.com/ZhaoJ9014/face.evoLVe.PyTorch and the PaddlePaddle version is available at https://github.com/ZhaoJ9014/face.evoLVe.PyTorch/tree/master/paddle. Face.evoLVe has been widely used for face analytics, receiving 2.4K stars and 622 forks.

Robust Matrix Factorization with Grouping Effect

Jul 08, 2021

Haiyan Jiang, Shuyu Li, Luwei Zhang, Haoyi Xiong, Dejing Dou

Abstract: Although many techniques have been applied to matrix factorization (MF), they may not fully exploit the feature structure. In this paper, we incorporate the grouping effect into MF and propose a novel method called Robust Matrix Factorization with Grouping effect (GRMF). The grouping effect is a generalization of the sparsity effect, which conducts denoising by clustering similar values around multiple centers instead of just around 0. Compared with existing algorithms, the proposed GRMF can automatically learn the grouping structure and sparsity in MF without prior knowledge, by introducing a naturally adjustable non-convex regularization to achieve simultaneous sparsity and grouping effect. Specifically, GRMF uses an efficient alternating minimization framework to perform MF, in which the original non-convex problem is first converted into a convex problem through Difference-of-Convex (DC) programming, and then solved by the Alternating Direction Method of Multipliers (ADMM). In addition, GRMF can be easily extended to the Non-negative Matrix Factorization (NMF) setting. Extensive experiments have been conducted using real-world data sets with outliers and contaminated noise, and the results show that GRMF achieves better performance and robustness than five benchmark algorithms.

* 22 pages, 5 figures, 4 tables

From Personalized Medicine to Population Health: A Survey of mHealth Sensing Techniques

Jul 02, 2021

Zhiyuan Wang, Haoyi Xiong, Jie Zhang, Sijia Yang, Mehdi Boukhechba, Laura E. Barnes, Daqing Zhang

Abstract: Mobile sensing apps have been widely used as a practical approach to collect behavioral and health-related information from individuals and provide timely intervention to promote health and well-being, such as mental health and chronic care. As the objectives of mobile sensing could be either (a) personalized medicine for individuals or (b) public health for populations, in this work we review the design of these mobile sensing apps, and propose to categorize the design of these apps/systems in two paradigms -- (i) Personal Sensing and (ii) Crowd Sensing. While both sensing paradigms might incorporate common ubiquitous sensing technologies, such as wearable sensors, mobility monitoring, mobile data offloading, and/or cloud-based data analytics to collect and process sensing data from individuals, we present a novel taxonomy system with two major components that can specify and classify apps/systems from aspects of the life-cycle of mHealth Sensing: (1) Sensing Task Creation & Participation, (2) Health Surveillance & Data Collection, and (3) Data Analysis & Knowledge Discovery. With respect to the different goals of the two paradigms, this work systematically reviews this field, and summarizes the design of typical apps/systems in view of the configurations and interactions between these two components. In addition to summarization, the proposed taxonomy system also helps identify potential directions of mobile sensing for health from both personalized medicine and population health perspectives.

* Submitted to a journal for review

Practical Assessment of Generalization Performance Robustness for Deep Networks via Contrastive Examples

Jun 20, 2021

Xuanyu Wu, Xuhong Li, Haoyi Xiong, Xiao Zhang, Siyu Huang, Dejing Dou

Abstract: Training images with data transformations have been suggested as contrastive examples to complement the testing set for generalization performance evaluation of deep neural networks (DNNs). In this work, we propose a practical framework ContRE (the word "contre" means "against" or "versus" in French) that uses Contrastive examples for DNN geneRalization performance Estimation. Specifically, ContRE follows the assumption in contrastive learning that robust DNN models with good generalization performance are capable of extracting a consistent set of features and making consistent predictions from the same image under varying data transformations. Incorporating a set of randomized strategies for well-designed data transformations over the training set, ContRE adopts classification errors and Fisher ratios on the generated contrastive examples to assess and analyze the generalization performance of deep models as a complement to the testing set. To show the effectiveness and the efficiency of ContRE, extensive experiments have been done using various DNN models on three open source benchmark datasets with thorough ablation studies and applicability analyses. Our experimental results confirm that (1) behaviors of deep models on contrastive examples are strongly correlated with their behaviors on the testing set, and (2) ContRE is a robust measure of generalization performance complementing the testing set in various settings.
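
The abstract names Fisher ratios as one of the measures computed on contrastive examples but does not define them. A minimal sketch, assuming the textbook definition (between-class scatter divided by within-class scatter) applied to scalar features, could look like the following; the function name and the toy feature values are hypothetical, and the exact form used by ContRE may differ:

```python
from statistics import fmean

def fisher_ratio(features_by_class):
    """Between-class scatter divided by within-class scatter for scalar features.

    `features_by_class` maps a class label to a list of feature values.
    This is the textbook form, assumed here for illustration.
    """
    all_values = [v for values in features_by_class.values() for v in values]
    overall_mean = fmean(all_values)
    between = sum(
        len(values) * (fmean(values) - overall_mean) ** 2
        for values in features_by_class.values()
    )
    within = sum(
        (v - fmean(values)) ** 2
        for values in features_by_class.values()
        for v in values
    )
    return between / within

# Toy features (assumed): well-separated classes versus overlapping classes.
separated = fisher_ratio({"cat": [1.0, 1.2, 0.8], "dog": [3.0, 3.2, 2.8]})
overlapping = fisher_ratio({"cat": [1.0, 2.0, 3.0], "dog": [2.0, 3.0, 4.0]})
```

A higher ratio means features of transformed images from the same class stay close together relative to the gap between classes, which matches the consistency assumption described in the abstract.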

Optimization Variance: Exploring Generalization Properties of DNNs

Jun 03, 2021

Xiao Zhang, Dongrui Wu, Haoyi Xiong, Bo Dai

Abstract: Unlike the conventional wisdom in statistical learning theory, the test error of a deep neural network (DNN) often demonstrates double descent: as the model complexity increases, it first follows a classical U-shaped curve and then shows a second descent. Through bias-variance decomposition, recent studies revealed that the bell-shaped variance is the major cause of model-wise double descent (when the DNN is widened gradually). This paper investigates epoch-wise double descent, i.e., the test error of a DNN also shows double descent as the number of training epochs increases. By extending the bias-variance analysis to epoch-wise double descent of the zero-one loss, we surprisingly find that the variance itself, without the bias, varies consistently with the test error. Inspired by this result, we propose a novel metric, optimization variance (OV), to measure the diversity of model updates caused by the stochastic gradients of random training batches drawn in the same iteration. OV can be estimated using samples from the training set only but correlates well with the (unknown) test error, and hence early stopping may be achieved without using a validation set.

* Work in progress
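
The abstract describes OV as a measure of how much model updates differ across random batches drawn at the same iteration, without giving a formula. A minimal sketch of one such diversity measure follows; the normalization by the mean squared update norm is an assumption made here for illustration, and the function name is hypothetical:

```python
def update_diversity(batch_updates):
    """Normalized spread of update vectors computed from different batches.

    `batch_updates` is a list of equal-length lists, one update vector per batch.
    Returns 0.0 when all batches agree and 1.0 when the updates cancel out on average.
    The normalization is an assumption, not the definition from the paper.
    """
    n = len(batch_updates)
    dim = len(batch_updates[0])
    mean = [sum(u[d] for u in batch_updates) / n for d in range(dim)]
    spread = sum(
        sum((u[d] - mean[d]) ** 2 for d in range(dim)) for u in batch_updates
    ) / n
    second_moment = sum(sum(x * x for x in u) for u in batch_updates) / n
    return spread / second_moment

agreeing = update_diversity([[1.0, 2.0], [1.0, 2.0]])
conflicting = update_diversity([[1.0, 0.0], [-1.0, 0.0]])
```

Because the inputs are updates from training batches only, a quantity of this kind can be tracked during training without a held-out set, which is the property the abstract relies on for early stopping.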

JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu

Jun 03, 2021

Hao Liu, Qian Gao, Jiang Li, Xiaochao Liao, Hao Xiong, Guangxing Chen, Wenlin Wang, Guobao Yang, Zhiwei Zha, Daxiang Dong(+2 more)

Abstract: In modern internet industries, deep learning based recommender systems have become an indispensable building block for a wide spectrum of applications, such as search engines, news feeds, and short video clips. However, it remains challenging to deploy well-trained deep models for online real-time inference serving, given the time-varying web-scale traffic from billions of users, in a cost-effective manner. In this work, we present JIZHI, a Model-as-a-Service system that handles hundreds of millions of online inference requests per second to huge deep models with trillions of sparse parameters, for over twenty real-time recommendation services at Baidu, Inc. In JIZHI, the inference workflow of every recommendation request is transformed into a Staged Event-Driven Pipeline (SEDP), where each node in the pipeline refers to a staged computation or I/O intensive task processor. As real-time inference requests arrive, each modularized processor can be run in a fully asynchronous way and managed separately. Besides, JIZHI introduces heterogeneous and hierarchical storage to further accelerate the online inference process by reducing unnecessary computations and potential data access latency induced by ultra-sparse model parameters. Moreover, an intelligent resource manager has been deployed to maximize the throughput of JIZHI over the shared infrastructure by searching for the optimal resource allocation plan from historical logs and fine-tuning the load shedding policies over intermediate system feedback. Extensive experiments have been done to demonstrate the advantages of JIZHI from the perspectives of end-to-end service latency, system-wide throughput, and resource consumption. JIZHI has helped Baidu save more than ten million US dollars in hardware and utility costs while handling 200% more traffic without sacrificing inference efficiency.

* Accepted to SIGKDD 2021 applied data science track
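
The staged event-driven pipeline idea in the abstract, where each stage is an independent processor that runs asynchronously, can be illustrated with a toy sketch. This is not JIZHI's implementation: the stage functions, the queue-based wiring, and the `None` shutdown sentinel are all assumptions chosen for the example.

```python
import asyncio

async def stage(handler, inbox, outbox):
    """Pull events from `inbox`, process them, and push results downstream."""
    while True:
        item = await inbox.get()
        if item is None:  # sentinel: pass shutdown along and stop this stage
            await outbox.put(None)
            return
        await outbox.put(handler(item))

async def run_pipeline(requests, handlers):
    """Wire one queue between each pair of stages and run all stages concurrently."""
    queues = [asyncio.Queue() for _ in range(len(handlers) + 1)]
    workers = [
        asyncio.create_task(stage(handler, queues[i], queues[i + 1]))
        for i, handler in enumerate(handlers)
    ]
    for request in requests:
        await queues[0].put(request)
    await queues[0].put(None)

    results = []
    while True:
        item = await queues[-1].get()
        if item is None:
            break
        results.append(item)
    await asyncio.gather(*workers)
    return results

# Hypothetical stages: an I/O-like feature lookup followed by a scoring step.
def lookup_features(user_id):
    return {"user": user_id, "features": [user_id, user_id * 2]}

def score(event):
    return {**event, "score": sum(event["features"])}

results = asyncio.run(run_pipeline([1, 2, 3], [lookup_features, score]))
```

Because each stage only talks to its neighbours through queues, a stage can be scaled, throttled, or replaced on its own, which is the kind of separate management the abstract attributes to modularized processors.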

From Distributed Machine Learning to Federated Learning: A Survey

May 10, 2021

Ji Liu, Jizhou Huang, Yang Zhou, Xuhong Li, Shilei Ji, Haoyi Xiong, Dejing Dou

Abstract: In recent years, data and computing resources have typically been distributed across the devices of end users, various regions, or organizations. Because of laws or regulations, the distributed data and computing resources cannot be directly shared among different regions or organizations for machine learning tasks. Federated learning emerges as an efficient approach to exploit distributed data and computing resources, so as to collaboratively train machine learning models, while obeying the laws and regulations and ensuring data security and data privacy. In this paper, we provide a comprehensive survey of existing work on federated learning. We propose a functional architecture of federated learning systems and a taxonomy of related techniques. Furthermore, we present the distributed training, data communication, and security of FL systems. Finally, we analyze their limitations and propose future research directions.

* 31 pages, 8 figures
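
To make the idea of collaborative training without sharing raw data concrete, here is a minimal sketch of federated averaging (FedAvg), a widely used aggregation rule. The abstract does not name a specific algorithm, so the choice of FedAvg is an assumption, as are the function name and the toy numbers:

```python
def federated_average(client_weights, client_sizes):
    """Average client model weights, weighting each client by its sample count.

    Only model weights leave the clients; the raw training data stays local.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]

# Toy example (assumed): two clients with 1 and 3 local samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full system the server would send `global_weights` back to the clients for another round of local training, and the communication and security concerns listed in the abstract apply to exactly these exchanges.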

SMILE: Self-Distilled MIxup for Efficient Transfer LEarning

Mar 25, 2021

Xingjian Li, Haoyi Xiong, Chengzhong Xu, Dejing Dou

Abstract: To improve the performance of deep learning, mixup has been proposed to force neural networks to favor simple linear behaviors in-between training samples. Performing mixup for transfer learning with pre-trained models, however, is not that simple: a high-capacity pre-trained model with a large fully-connected (FC) layer could easily overfit to the target dataset even with samples-to-labels mixed up. In this work, we propose SMILE, Self-Distilled Mixup for EffIcient Transfer LEarning. With mixed images as inputs, SMILE regularizes the outputs of CNN feature extractors to learn from the mixed feature vectors of inputs (sample-to-feature mixup), in addition to the mixed labels. Specifically, SMILE incorporates a mean teacher, inherited from the pre-trained model, to provide the feature vectors of input samples in a self-distilling fashion, and mixes up the feature vectors accordingly via a novel triplet regularizer. The triplet regularizer balances the mixup effects in both feature and label spaces while bounding the linearity in-between samples for pre-training tasks. Extensive experiments have been done to verify the performance improvement made by SMILE, in comparison with a wide spectrum of transfer learning algorithms, including fine-tuning, L2-SP, DELTA, and RIFLE, even with mixup strategies combined. Ablation studies show that the vanilla sample-to-label mixup strategies could marginally increase the linearity in-between training samples but lack generalizability, while SMILE significantly improves the mixup effects in both label and feature spaces with both training and testing datasets. The empirical observations back up our design intuition and purposes.
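
The sample-to-label mixup that the abstract builds on can be sketched in a few lines. This shows only the basic interpolation, not SMILE's mean teacher or triplet regularizer; the helper names, the `alpha` value, and the toy vectors are assumptions for illustration:

```python
import random

def interpolate(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b of two equal-length vectors."""
    return [lam * p + (1.0 - lam) * q for p, q in zip(a, b)]

def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mix two (input, one-hot label) pairs with a Beta(alpha, alpha) coefficient."""
    if rng is None:
        rng = random.Random()
    lam = rng.betavariate(alpha, alpha)
    return interpolate(x1, x2, lam), interpolate(y1, y2, lam), lam

rng = random.Random(0)
x_mix, y_mix, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], rng=rng)

# Sample-to-feature mixup, as described in the abstract, would apply the same
# coefficient to feature vectors; these toy features are hypothetical.
feat_mix = interpolate([0.5, 0.5, 0.0], [0.0, 0.5, 0.5], lam)
```

Reusing the same `lam` for inputs, labels, and features keeps the three interpolations consistent, so the network is asked to behave linearly between the two samples in every space at once.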

Interpretable Deep Learning: Interpretations, Interpretability, Trustworthiness, and Beyond

Mar 19, 2021

Xuhong Li, Haoyi Xiong, Xingjian Li, Xuanyu Wu, Xiao Zhang, Ji Liu, Jiang Bian, Dejing Dou

Abstract: Deep neural networks are well known for their superb performance in handling various machine learning and artificial intelligence tasks. However, due to their over-parameterized black-box nature, it is often difficult to understand the prediction results of deep models. In recent years, many interpretation tools have been proposed to explain or reveal the ways that deep models make decisions. In this paper, we review this line of research and try to make a comprehensive survey. Specifically, we introduce and clarify two basic concepts, interpretations and interpretability, that people often confuse. First of all, to address the research efforts in interpretations, we elaborate on the design of several recent interpretation algorithms, from different perspectives, through proposing a new taxonomy. Then, to understand the results of interpretation, we also survey the performance metrics for evaluating interpretation algorithms. Further, we summarize the existing work in evaluating models' interpretability using "trustworthy" interpretation algorithms. Finally, we review and discuss the connections between deep models' interpretations and other factors, such as adversarial robustness and data augmentations, and we introduce several open-source libraries for interpretation algorithms and evaluation approaches.
