In the past few decades, to reduce the risk of X-ray in computed tomography (CT), low-dose CT image denoising has attracted extensive attention from researchers, which has become an important research issue in the field of medical images. In recent years, with the rapid development of deep learning technology, many algorithms have emerged to apply convolutional neural networks to this task, achieving promising results. However, there are still some problems such as low denoising efficiency, over-smoothed result, etc. In this paper, we propose the Edge enhancement based Densely connected Convolutional Neural Network (EDCNN). In our network, we design an edge enhancement module using the proposed novel trainable Sobel convolution. Based on this module, we construct a model with dense connections to fuse the extracted edge information and realize end-to-end image denoising. Besides, when training the model, we introduce a compound loss that combines MSE loss and multi-scales perceptual loss to solve the over-smoothed problem and attain a marked improvement in image quality after denoising. Compared with the existing low-dose CT image denoising algorithms, our proposed model has a better performance in preserving details and suppressing noise.
As facial interaction systems are prevalently deployed, security and reliability of these systems become a critical issue, with substantial research efforts devoted. Among them, face anti-spoofing emerges as an important area, whose objective is to identify whether a presented face is live or spoof. Though promising progress has been achieved, existing works still have difficulty in handling complex spoof attacks and generalizing to real-world scenarios. The main reason is that current face anti-spoofing datasets are limited in both quantity and diversity. To overcome these obstacles, we contribute a large-scale face anti-spoofing dataset, CelebA-Spoof, with the following appealing properties: 1) Quantity: CelebA-Spoof comprises of 625,537 pictures of 10,177 subjects, significantly larger than the existing datasets. 2) Diversity: The spoof images are captured from 8 scenes (2 environments * 4 illumination conditions) with more than 10 sensors. 3) Annotation Richness: CelebA-Spoof contains 10 spoof type annotations, as well as the 40 attribute annotations inherited from the original CelebA dataset. Equipped with CelebA-Spoof, we carefully benchmark existing methods in a unified multi-task framework, Auxiliary Information Embedding Network (AENet), and reveal several valuable observations.
Feature pyramid network (FPN) based models, which fuse the semantics and salient details in a progressive manner, have been proven highly effective in salient object detection. However, it is observed that these models often generate saliency maps with incomplete object structures or unclear object boundaries, due to the \emph{indirect} information propagation among distant layers that makes such fusion structure less effective. In this work, we propose a novel Cross-layer Feature Pyramid Network (CFPN), in which direct cross-layer communication is enabled to improve the progressive fusion in salient object detection. Specifically, the proposed network first aggregates multi-scale features from different layers into feature maps that have access to both the high- and low-level information. Then, it distributes the aggregated features to all the involved layers to gain access to richer context. In this way, the distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information. Extensive experimental results over six widely used salient object detection benchmarks and with three popular backbones clearly demonstrate that CFPN can accurately locate fairly complete salient regions and effectively segment the object boundaries.
Partial Label Learning (PLL) aims to learn from the data where each training instance is associated with a set of candidate labels, among which only one is correct. Most existing methods deal with such problem by either treating each candidate label equally or identifying the ground-truth label iteratively. In this paper, we propose a novel PLL approach called HERA, which simultaneously incorporates the HeterogEneous Loss and the SpaRse and Low-rAnk procedure to estimate the labeling confidence for each instance while training the model. Specifically, the heterogeneous loss integrates the strengths of both the pairwise ranking loss and the pointwise reconstruction loss to provide informative label ranking and reconstruction information for label identification, while the embedded sparse and low-rank scheme constrains the sparsity of ground-truth label matrix and the low rank of noise label matrix to explore the global label relevance among the whole training data for improving the learning model. Extensive experiments on both artificial and real-world data sets demonstrate that our method can achieve superior or comparable performance against the state-of-the-art methods.
As one of the important functions of the intelligent transportation system (ITS), supply-demand prediction for autonomous vehicles provides a decision basis for its control. In this paper, we present two prediction models (i.e. ARLP model and Advanced ARLP model) based on two system environments that only the current day's historical data is available or several days' historical data are available. These two models jointly consider the spatial, temporal, and semantic relations. Spatial dependency is captured with residual network and dimension reduction. Short term temporal dependency is captured with LSTM. Long term temporal dependency and temporal shifting are captured with LSTM and attention mechanism. Semantic dependency is captured with multi-attention mechanism and autocorrelation coefficient method. Extensive experiments show that our frameworks provide more accurate and stable prediction results than the existing methods.
Partial Label Learning (PLL) aims to learn from the data where each training example is associated with a set of candidate labels, among which only one is correct. The key to deal with such problem is to disambiguate the candidate label sets and obtain the correct assignments between instances and their candidate labels. In this paper, we interpret such assignments as instance-to-label matchings, and reformulate the task of PLL as a matching selection problem. To model such problem, we propose a novel Graph Matching based Partial Label Learning (GM-PLL) framework, where Graph Matching (GM) scheme is incorporated owing to its excellent capability of exploiting the instance and label relationship. Meanwhile, since conventional one-to-one GM algorithm does not satisfy the constraint of PLL problem that multiple instances may correspond to the same label, we extend a traditional one-to-one probabilistic matching algorithm to the many-to-one constraint, and make the proposed framework accommodate to the PLL problem. Moreover, we also propose a relaxed matching prediction model, which can improve the prediction accuracy via GM strategy. Extensive experiments on both artificial and real-world data sets demonstrate that the proposed method can achieve superior or comparable performance against the state-of-the-art methods.
Human parsing is attracting increasing research attention. In this work, we aim to push the frontier of human parsing by introducing the problem of multi-human parsing in the wild. Existing works on human parsing mainly tackle single-person scenarios, which deviates from real-world applications where multiple persons are present simultaneously with interaction and occlusion. To address the multi-human parsing problem, we introduce a new multi-human parsing (MHP) dataset and a novel multi-human parsing model named MH-Parser. The MHP dataset contains multiple persons captured in real-world scenes with pixel-level fine-grained semantic annotations in an instance-aware setting. The MH-Parser generates global parsing maps and person instance masks simultaneously in a bottom-up fashion with the help of a new Graph-GAN model. We envision that the MHP dataset will serve as a valuable data resource to develop new multi-human parsing models, and the MH-Parser offers a strong baseline to drive future research for multi-human parsing in the wild.