Lei Cai

MFMAN-YOLO: A Method for Detecting Pole-like Obstacles in Complex Environment

Jul 24, 2023
Lei Cai, Hao Wang, Congling Zhou, Yongqiang Wang, Boyu Liu

In real-world traffic, road and weather conditions introduce many uncertainties and complexities. To address the problem that the feature information of pole-like obstacles is easily lost in complex environments, leading to low detection accuracy and poor real-time performance, this paper proposes a multi-scale hybrid attention mechanism detection algorithm. First, the Monge-Kantorovich (MK) optimal transport function is incorporated: it not only resolves the overlap of multiple prediction boxes through optimal matching, but can also be regularized to prevent model over-fitting. Then, features at different scales are up-sampled separately according to an optimized, efficient multi-scale feature pyramid. Finally, a hybrid attention mechanism enhances the extraction of multi-scale spatial and channel feature information in complex environments, suppressing irrelevant background information and focusing on the feature information of pole-like obstacles. This paper also conducts real road test experiments in a variety of complex environments. The experimental results show that the detection precision, recall, and average precision of the method are 94.7%, 93.1%, and 97.4%, respectively, with a detection frame rate of 400 f/s. The method can detect pole-like obstacles in complex road environments accurately and in real time, further promoting innovation and progress in the field of automatic driving.
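The Monge-Kantorovich problem can be solved approximately with entropy-regularized Sinkhorn iterations. As a rough illustration of how optimal transport can match overlapping prediction boxes to ground-truth targets, here is a minimal NumPy sketch; the cost matrix, marginals, and regularization below are hypothetical placeholders rather than the paper's actual formulation.

```python
import numpy as np

def sinkhorn(cost, row_mass, col_mass, eps=0.1, n_iters=50):
    """Entropy-regularized optimal transport (Sinkhorn-Knopp).

    cost: (n_pred, n_gt) matching cost between predictions and targets.
    row_mass / col_mass: marginals (each must sum to the same total).
    Returns a transport plan whose row/column sums match the marginals.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel; eps controls regularization
    u = np.ones_like(row_mass)
    for _ in range(n_iters):
        v = col_mass / (K.T @ u)     # scale columns to match col_mass
        u = row_mass / (K @ v)       # scale rows to match row_mass
    return u[:, None] * K * v[None, :]

# Toy example: 4 predicted boxes, 2 ground-truth obstacles.
rng = np.random.default_rng(0)
cost = rng.random((4, 2))            # e.g. 1 - IoU plus a classification term
plan = sinkhorn(cost, np.full(4, 0.25), np.full(2, 0.5))
assign = plan.argmax(axis=1)         # each prediction's best-matched target
print(plan.round(3), assign)
```

In a detector, the cost would typically combine localization (e.g., IoU-based) and classification terms, and the resulting plan decides which predictions are assigned to which targets.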

* 11 pages 

Multi-scale Attentive Image De-raining Networks via Neural Architecture Search

Jul 02, 2022
Lei Cai, Yuli Fu, Wanliang Huo, Youjun Xiang, Tao Zhu, Ying Zhang, Huanqiang Zeng

Multi-scale architectures and attention modules have shown effectiveness in many deep learning-based image de-raining methods. However, manually designing and integrating these two components into a neural network requires considerable labor and extensive expertise. In this article, a high-performance multi-scale attentive neural architecture search (MANAS) framework is developed for image de-raining. The proposed method formulates a new multi-scale attention search space with multiple flexible modules that are well-suited to the image de-raining task. Within this search space, multi-scale attentive cells are built and used to construct a powerful image de-raining network. The internal multi-scale attentive architecture of the de-raining network is searched automatically through a gradient-based search algorithm, which largely avoids the daunting procedure of manual design. Moreover, to obtain a robust image de-raining model, a practical and effective multi-to-one training strategy is presented that allows the de-raining network to gather sufficient background information from multiple rainy images sharing the same background scene; meanwhile, multiple loss functions, including external loss, internal loss, architecture regularization loss, and model complexity loss, are jointly optimized to achieve robust de-raining performance and controllable model complexity. Extensive experimental results on both synthetic and realistic rainy images, as well as downstream vision applications (i.e., object detection and segmentation), consistently demonstrate the superiority of the proposed method.
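For intuition on the gradient-based search, the sketch below shows a DARTS-style continuous relaxation, where a softmax over learnable architecture weights blends candidate operations so the architectural choice itself becomes differentiable. The candidate set here is illustrative (PyTorch assumed), not the paper's actual multi-scale attention search space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted mixture of candidate ops (DARTS-style relaxation).

    The architecture weights `alpha` are learned by gradient descent
    alongside the network weights; after search, the op with the largest
    weight is kept. The candidate set below is illustrative only.
    """
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),              # plain conv
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2),  # dilated conv
            nn.Identity(),                                            # skip connection
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

x = torch.randn(1, 16, 32, 32)
print(MixedOp(16)(x).shape)  # torch.Size([1, 16, 32, 32])
```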

MoleculeKit: Machine Learning Methods for Molecular Property Prediction and Drug Discovery

Dec 02, 2020
Zhengyang Wang, Meng Liu, Youzhi Luo, Zhao Xu, Yaochen Xie, Limei Wang, Lei Cai, Shuiwang Ji

Properties of molecules are indicative of their functions and are thus useful in many applications. As a cost-effective alternative to experimental approaches, computational methods for predicting molecular properties are gaining increasing momentum and success. However, a comprehensive collection of tools and methods for this task is currently lacking. Here we develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations for molecular property prediction and drug discovery. Specifically, MoleculeKit represents molecules as both graphs and sequences. Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data. Notably, we propose and develop novel deep models for learning from molecular graphs and sequences. Therefore, MoleculeKit not only serves as a comprehensive tool but also contributes novel and advanced graph and sequence learning methodologies. Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
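To make the dual representation concrete, the following sketch (assuming RDKit is installed) derives both views from a SMILES string: a graph view (atoms plus adjacency matrix) for graph models and the sequence view for sequence models. It illustrates the general idea only, not MoleculeKit's actual API.

```python
from rdkit import Chem  # assumes RDKit is available

def dual_representation(smiles):
    """Represent one molecule both as a graph and as a sequence,
    mirroring the two views that graph and sequence models consume."""
    mol = Chem.MolFromSmiles(smiles)
    # Graph view: node list (atom symbols) plus adjacency matrix.
    atoms = [atom.GetSymbol() for atom in mol.GetAtoms()]
    adjacency = Chem.GetAdjacencyMatrix(mol)
    # Sequence view: the canonical SMILES string itself.
    sequence = Chem.MolToSmiles(mol)
    return atoms, adjacency, sequence

atoms, adj, seq = dual_representation("CCO")  # ethanol
print(atoms)   # ['C', 'C', 'O']
print(adj)     # 3x3 adjacency matrix
print(seq)     # 'CCO'
```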

* Supplementary Material: https://documentcloud.adobe.com/link/track?uri=urn:aaid:scds:US:d0ca85d1-c6f9-428b-ae2b-c3bf3257196d 

Deep Low-Shot Learning for Biological Image Classification and Visualization from Limited Training Samples

Oct 20, 2020
Lei Cai, Zhengyang Wang, Rob Kulathinal, Sudhir Kumar, Shuiwang Ji

Predictive modeling is useful but very challenging in biological image analysis due to the high cost of obtaining and labeling training data. For example, in the study of gene interaction and regulation in Drosophila embryogenesis, the analysis is most biologically meaningful when in situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared. However, labeling training data with precise stages is very time-consuming, even for developmental biologists. Thus, a critical challenge is how to build accurate computational models for precise developmental stage classification from limited training samples. In addition, identification and visualization of developmental landmarks are required to enable biologists to interpret prediction results and calibrate models. To address these challenges, we propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images. Specifically, to enable accurate model training on limited samples, we formulate the task as a deep low-shot learning problem and develop a novel two-step learning approach comprising data-level learning and feature-level learning. We use a deep residual network as our base model and achieve improved performance in the precise stage prediction task on ISH images. Furthermore, the deep model can be interpreted by computing saliency maps, which consist of pixel-wise contributions of an image to its prediction result. In our task, saliency maps are used to assist the identification and visualization of developmental landmarks. Our experimental results show that the proposed model can not only make accurate predictions but also yield biologically meaningful interpretations. We anticipate our methods to be easily generalizable to other biological image classification tasks with small training datasets.
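A common way to compute such saliency maps is a single backward pass from the predicted class score to the input pixels. The sketch below shows this generic gradient-saliency recipe in PyTorch, using a stock ResNet-18 and a random tensor as hypothetical stand-ins for the paper's trained model and an ISH image.

```python
import torch
import torchvision.models as models

# Generic gradient saliency: how much does each input pixel influence
# the top predicted class score?
model = models.resnet18(weights=None).eval()               # stand-in model
image = torch.randn(1, 3, 224, 224, requires_grad=True)    # dummy input

logits = model(image)
logits[0, logits[0].argmax()].backward()   # gradient of the top class score

# Pixel-wise contribution: max absolute gradient across color channels.
saliency = image.grad.abs().max(dim=1)[0]  # shape (1, 224, 224)
print(saliency.shape)
```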

Line Graph Neural Networks for Link Prediction

Oct 20, 2020
Lei Cai, Jundong Li, Jie Wang, Shuiwang Ji

We consider the graph link prediction task, a classic graph analytical problem with many real-world applications. With the advances of deep learning, current link prediction methods commonly compute features from subgraphs centered at two neighboring nodes and use those features to predict the label of the link between the two nodes. In this formalism, a link prediction problem is converted into a graph classification task. To extract fixed-size features for classification, graph pooling layers are necessary in the deep learning model, thereby incurring information loss. To overcome this key limitation, we take a radically different path by making use of line graphs from graph theory. In particular, each node in a line graph corresponds to a unique edge in the original graph. Therefore, link prediction in the original graph can be equivalently solved as node classification in its corresponding line graph, instead of as graph classification. Experimental results on fourteen datasets from different applications demonstrate that our proposed method consistently outperforms state-of-the-art methods, while using fewer parameters and offering high training efficiency.
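The line-graph transformation itself is simple: every edge of the original graph becomes a node, and two such nodes are adjacent exactly when the corresponding edges share an endpoint. A minimal pure-Python sketch on a toy input (not the paper's code):

```python
from itertools import combinations

def line_graph(edges):
    """Build the line graph: each original edge becomes a node, and two
    such nodes are connected iff the original edges share an endpoint."""
    nodes = [frozenset(e) for e in edges]
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(nodes), 2)
        if a & b  # the two original edges share an endpoint
    ]

# Toy graph: a path 0-1-2 plus the edge 1-3.
edges = [(0, 1), (1, 2), (1, 3)]
print(line_graph(edges))
# [(0, 1), (0, 2), (1, 2)] -- all three edges meet at node 1
```

Node classification on the resulting line-graph nodes then answers the original link prediction question directly, with no pooling step.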

Structural Plan of Indoor Scenes with Personalized Preferences

Aug 05, 2020
Xinhan Di, Pengqian Yu, Hong Zhu, Lei Cai, Qiuyan Sheng, Changyu Sun

In this paper, we propose an assistive model that supports professional interior designers in producing industrial interior decoration solutions that meet the personalized preferences of property owners. The proposed model automatically produces the layout of objects in a particular indoor scene according to the property owner's preferences. In particular, the model consists of abstract graph extraction, conditional graph generation, and conditional scene instantiation. We provide an interior layout dataset containing 11,000 real-world designs from professional designers. Our numerical results on this dataset demonstrate the effectiveness of the proposed model compared with state-of-the-art methods.

* Accepted by the 8th International Workshop on Assistive Computer Vision and Robotics (ACVR) in Conjunction with ECCV 2020 

Deep Learning of High-Order Interactions for Protein Interface Prediction

Jul 18, 2020
Yi Liu, Hao Yuan, Lei Cai, Shuiwang Ji

Protein interactions are important in a broad range of biological processes. Traditionally, computational methods have been developed to predict protein interfaces automatically from hand-crafted features. Recent approaches employ deep neural networks and predict the interaction of each amino acid pair independently. However, these methods do not incorporate the important sequential information from amino acid chains or the high-order pairwise interactions. Intuitively, the prediction for an amino acid pair should depend on both the pair's own features and the information of other amino acid pairs. In this work, we propose to formulate protein interface prediction as a 2D dense prediction problem. In addition, we propose a novel deep model that incorporates sequential information and high-order pairwise interactions to perform interface prediction. We represent proteins as graphs and employ graph neural networks to learn node features. We then propose a sequential modeling method that incorporates the sequential information and reorders the feature matrix. Next, we incorporate high-order pairwise interactions to generate a 3D tensor containing the different pairwise interactions. Finally, we employ convolutional neural networks to perform 2D dense predictions. Experimental results on multiple benchmarks demonstrate that our proposed method consistently improves protein interface prediction performance.
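The pairwise tensor can be built by broadcasting the two proteins' per-residue feature matrices against each other, yielding an image-like tensor a 2D CNN can densely predict over. A minimal PyTorch sketch under that assumption (the feature dimensions and concatenation scheme are illustrative, not the paper's exact construction):

```python
import torch

def pairwise_tensor(feat_a, feat_b):
    """Combine per-residue features of two proteins into a 2D 'image'
    whose (i, j) cell holds the concatenated features of residue pair
    (i, j), ready for 2D dense prediction with a CNN."""
    n, d = feat_a.shape
    m, _ = feat_b.shape
    a = feat_a.unsqueeze(1).expand(n, m, d)   # broadcast along partner axis
    b = feat_b.unsqueeze(0).expand(n, m, d)
    return torch.cat([a, b], dim=-1)          # shape (n, m, 2d)

feat_a = torch.randn(5, 8)   # 5 residues, 8-dim learned node features
feat_b = torch.randn(7, 8)   # binding-partner protein
pairs = pairwise_tensor(feat_a, feat_b)
print(pairs.shape)           # torch.Size([5, 7, 16])
# A Conv2d over pairs.permute(2, 0, 1).unsqueeze(0) could then predict
# a contact probability for every residue pair.
```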

* 10 pages, 3 figures, 4 tables. KDD2020 

Adversarial Model for Rotated Indoor Scenes Planning

Jul 07, 2020
Xinhan Di, Pengqian Yu, Hong Zhu, Lei Cai, Qiuyan Sheng, Changyu Sun

In this paper, we propose an adversarial model for producing furniture layouts for interior scene synthesis when the interior room is rotated. The proposed model combines a conditional adversarial network, a rotation module, a mode module, and a rotation discriminator module. Compared with prior work on scene synthesis, the three proposed modules enhance the ability of auto-layout generation and reduce mode collapse during rotation of the interior room. We conduct our experiments on a proposed real-world interior layout dataset that contains 14,400 designs from professional designers. Our numerical results demonstrate that the proposed model yields higher-quality layouts for four types of rooms: the bedroom, the bathroom, the study room, and the tatami room.
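As a purely geometric illustration of what a rotation module must handle, the toy PyTorch sketch below rotates furniture-center coordinates about the room origin; the paper's module operates inside the adversarial network, and this standalone function is only a hypothetical stand-in.

```python
import torch

def rotate_layout(centers, angle_rad):
    """Rotate furniture-center coordinates about the room origin; a toy
    stand-in for keeping generated layouts consistent when the room
    itself is rotated."""
    c, s = torch.cos(angle_rad), torch.sin(angle_rad)
    rot = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])
    return centers @ rot.T

centers = torch.tensor([[1.0, 0.0], [0.0, 2.0]])  # two furniture items
print(rotate_layout(centers, torch.tensor(torch.pi / 2)))
# ~ [[0, 1], [-2, 0]] after a 90-degree rotation
```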

Towards Adversarial Planning for Indoor Scenes with Rotation

Jun 24, 2020
Xinhan Di, Pengqian Yu, Hong Zhu, Lei Cai, Qiuyan Sheng, Changyu Sun

In this paper, we propose an adversarial model for producing furniture layouts for interior scene synthesis when the interior room is rotated. The proposed model combines a conditional adversarial network, a rotation module, a mode module, and a rotation discriminator module. Compared with prior work on scene synthesis, the three proposed modules enhance the ability of auto-layout generation and reduce mode collapse during rotation of the interior room. We provide an interior layout dataset that contains $14400$ designs with rotation from professional designers. In our experiments, we compare the quality of the layouts against two baselines. The numerical results demonstrate that the proposed model provides higher-quality layouts for four types of rooms: the bedroom, the bathroom, the study room, and the tatami room.

* Submitted to a conference 

Structural Temporal Graph Neural Networks for Anomaly Detection in Dynamic Graphs

May 25, 2020
Lei Cai, Zhengzhang Chen, Chen Luo, Jiaping Gui, Jingchao Ni, Ding Li, Haifeng Chen

Detecting anomalies in dynamic graphs is a vital task with numerous practical applications in areas such as security, finance, and social media. Previous network-embedding-based methods have mostly focused on learning good node representations while largely ignoring the subgraph structural changes related to the target nodes in dynamic graphs. In this paper, we propose StrGNN, an end-to-end structural temporal graph neural network model for detecting anomalous edges in dynamic graphs. In particular, we first extract the $h$-hop enclosing subgraph centered on the target edge and propose a node labeling function to identify the role of each node in the subgraph. We then leverage graph convolution operations and a Sortpooling layer to extract a fixed-size feature from each snapshot/timestamp. Based on the extracted features, we utilize gated recurrent units (GRUs) to capture the temporal information for anomaly detection. Extensive experiments on six benchmark datasets and a real enterprise security system demonstrate the effectiveness of StrGNN.
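To illustrate the temporal stage, the sketch below feeds one fixed-size structural feature vector per snapshot into a GRU that scores the target edge; the GNN-plus-Sortpooling encoder that would produce those vectors is abstracted away, so treat this as a hedged PyTorch toy, not StrGNN itself.

```python
import torch
import torch.nn as nn

class SnapshotGRU(nn.Module):
    """Toy temporal head in the spirit of StrGNN: a GRU consumes one
    fixed-size structural feature vector per graph snapshot and scores
    the target edge as anomalous or not. The encoder that would produce
    those per-snapshot vectors is abstracted away here."""
    def __init__(self, feat_dim, hidden_dim=32):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, snapshot_feats):              # (batch, time, feat_dim)
        _, h_last = self.gru(snapshot_feats)        # final hidden state
        return torch.sigmoid(self.score(h_last[-1]))  # anomaly probability

feats = torch.randn(4, 6, 16)        # 4 edges, 6 snapshots, 16-dim features
print(SnapshotGRU(16)(feats).shape)  # torch.Size([4, 1])
```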
