Wei Liu

Real-time Universal Style Transfer on High-resolution Images via Zero-channel Pruning

Jun 23, 2020
Jie An, Tao Li, Haozhi Huang, Li Shen, Xuan Wang, Yongyi Tang, Jinwen Ma, Wei Liu, Jiebo Luo

Extracting effective deep features to represent content and style information is the key to universal style transfer. Most existing algorithms use VGG19 as the feature extractor, which incurs a high computational cost and impedes real-time style transfer on high-resolution images. In this work, we propose a lightweight alternative architecture, ArtNet, which is based on GoogLeNet and then pruned by a novel channel pruning method named Zero-channel Pruning, specially designed for style transfer approaches. In addition, we propose a theoretically sound sandwich swap transform (S2) module to transfer deep features, which can create a pleasing holistic appearance and good local textures with improved content preservation. By using ArtNet and S2, our method is 2.3 to 107.4 times faster than state-of-the-art approaches. Comprehensive experiments demonstrate that ArtNet can achieve universal, real-time, and high-quality style transfer on high-resolution images simultaneously (68.03 FPS on 512 × 512 images).

AlphaGAN: Fully Differentiable Architecture Search for Generative Adversarial Networks

Jun 16, 2020
Yuesong Tian, Li Shen, Li Shen, Guinan Su, Zhifeng Li, Wei Liu

Generative Adversarial Networks (GANs) are formulated as minimax game problems, whereby generators attempt to approach real data distributions by virtue of adversarial learning against discriminators. The intrinsic problem complexity makes it challenging to enhance the performance of generative networks. In this work, we aim to boost model learning from the perspective of network architectures, by incorporating recent progress on automated architecture search into GANs. To this end, we propose a fully differentiable search framework for generative adversarial networks, dubbed alphaGAN. The searching process is formalized as solving a bi-level minimax optimization problem, in which the outer-level objective seeks a suitable network architecture towards pure Nash Equilibrium, conditioned on the generator and discriminator network parameters optimized with a traditional GAN loss in the inner level. The entire optimization uses a first-order method that alternately minimizes the two-level objective in a fully differentiable manner, enabling architecture search to be completed in an enormous search space. Extensive experiments on the CIFAR-10 and STL-10 datasets show that our algorithm can obtain high-performing architectures with only 3 GPU-hours on a single GPU in a search space comprising approximately $2 \times 10^{11}$ possible configurations. We also provide a comprehensive analysis of the behavior of the searching process and the properties of the searched architectures, which would benefit further research on architectures for generative models. Pretrained models and code are available at https://github.com/yuesongtian/AlphaGAN.

GANgster: A Fraud Review Detector based on Regulated GAN with Data Augmentation

Jun 11, 2020
Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

The financial implications of written reviews give businesses a strong incentive to pay fraudsters to write fraud reviews or to use bots to generate them. The promising performance of Deep Neural Networks (DNNs) in text classification has attracted research into using them for fraud review detection. However, the lack of trusted labeled data has limited the performance of current solutions in detecting fraud reviews. Unsupervised and semi-supervised methods are among the most applicable methods for dealing with the data scarcity problem. The Generative Adversarial Network (GAN), as a semi-supervised method, has been shown to be effective for data augmentation purposes. The state-of-the-art solution utilizes a GAN to overcome the data limitation problem. However, it fails to incorporate behavioral clues in both fraud generation and detection. Besides, the state-of-the-art approach suffers from a common limitation in the training convergence of the GAN, which slows down the training procedure. In this work, we propose a regularised GAN for fraud review detection that makes use of both review text and review rating scores. Scores are incorporated into the loss function through Information Gain Maximization for two reasons. One is to generate near-authentic, more human-like, score-correlated reviews. The other is to improve the stability of the GAN. Experimental results show better convergence of the regulated GAN. In addition, the scores are used in combination with word embeddings of the review text as input to the discriminators for better performance. Results show that the proposed framework outperforms the existing state-of-the-art framework, namely FakeGAN, in terms of AP by a relative 7% and 5% on the Yelp and TripAdvisor datasets, respectively.

DFraud3- Multi-Component Fraud Detection free of Cold-start

Jun 11, 2020
Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

Fraud review detection has been a hot research topic in recent years. Cold-start is a relatively new but significant problem, referring to the failure of a detection system to recognize the authenticity of a new user. State-of-the-art solutions employ a translational knowledge graph embedding approach (TransE) to model the interaction of the components of a review system. However, these approaches suffer from the limitation of TransE in handling N-1 relations and from the narrow scope of a single classification task, i.e., detecting fraudsters only. In this paper, we model a review system as a Heterogeneous Information Network (HIN), which gives every component a unique representation, and perform graph inductive learning on the review data by aggregating features of nearby nodes. HIN with graph induction helps to address the camouflage issue (fraudsters with genuine reviews), which has been shown to be more severe when coupled with cold-start, i.e., new fraudsters with genuine first reviews. In this research, instead of focusing on only one component, detecting either fraud reviews or fraud users (fraudsters), vector representations are learnt for each component, enabling multi-component classification. In other words, we are able to detect fraud reviews, fraudsters, and fraud-targeted items, hence the name of our approach, DFraud3. DFraud3 demonstrates a significant accuracy increase of 13% over the state of the art on Yelp.

TCDesc: Learning Topology Consistent Descriptors

Jun 05, 2020
Honghu Pan, Fanyang Meng, Zhenyu He, Yongsheng Liang, Wei Liu

Triplet loss is widely used for learning local descriptors from image patches. However, triplet loss only minimizes the Euclidean distance between matching descriptors and maximizes that between non-matching descriptors, which neglects the topology similarity between two descriptor sets. In this paper, we propose a topology measure in addition to Euclidean distance to learn topology-consistent descriptors by considering the kNN descriptors of the positive sample. First, we establish a novel topology vector for each descriptor, following Locally Linear Embedding (LLE), to indicate the topological relation between the descriptor and its kNN descriptors. Then, we define the topology distance between descriptors as the difference of their topology vectors. Last, we employ a dynamic weighting strategy to fuse the Euclidean distance and the topology distance of matching descriptors, and take the fusion result as the positive sample distance in the triplet loss. Experimental results on several benchmarks show that our method performs better than state-of-the-art results and effectively improves the performance of triplet loss.
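The fusion step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the function names, the fixed `weight` (the abstract describes a dynamic weighting strategy whose form is not given), the Euclidean norm used for the topology distance, and the margin value. The topology vectors are taken as precomputed inputs rather than derived via LLE.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_triplet_loss(anchor, positive, negative,
                       topo_anchor, topo_positive,
                       weight=0.5, margin=1.0):
    """Triplet margin loss whose positive distance fuses two terms.

    `weight` is a hypothetical fixed stand-in for the dynamic weighting
    mentioned in the abstract; `topo_*` are precomputed topology vectors.
    """
    d_euc = euclidean(anchor, positive)            # descriptor distance
    d_topo = euclidean(topo_anchor, topo_positive)  # topology distance (assumed L2)
    d_pos = weight * d_euc + (1.0 - weight) * d_topo
    d_neg = euclidean(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)
```

With `weight=1.0` the function reduces to the ordinary triplet margin loss, which makes it easy to compare the two settings on the same triplets.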

CPOT: Channel Pruning via Optimal Transport

May 21, 2020
Yucong Shen, Li Shen, Hao-Zhi Huang, Xuan Wang, Wei Liu

Recent advances in deep neural networks (DNNs) have led to a tremendous growth in network parameters, making the deployment of DNNs on platforms with limited resources extremely difficult. Therefore, various pruning methods have been developed to compress deep network architectures and accelerate the inference process. Most existing channel pruning methods discard the less important filters according to well-designed filter ranking criteria. However, due to the limited interpretability of deep learning models, designing an appropriate ranking criterion to distinguish redundant filters is difficult. To address this challenging issue, we propose a new technique of Channel Pruning via Optimal Transport, dubbed CPOT. Specifically, we locate the Wasserstein barycenter for the channels of each layer in the deep models, which is the mean of a set of probability distributions under the optimal transport metric. Then, we prune the redundant information located by the Wasserstein barycenters. Finally, we empirically demonstrate that, for classification tasks, CPOT outperforms the state-of-the-art methods on pruning ResNet-20, ResNet-32, ResNet-56, and ResNet-110. Furthermore, we show that the proposed CPOT technique is good at compressing StarGAN models by pruning in the more difficult case of image-to-image translation tasks.

* 11 pages

Hierarchical Regression Network for Spectral Reconstruction from RGB Images

May 10, 2020
Yuzhi Zhao, Lai-Man Po, Qiong Yan, Wei Liu, Tingyu Lin

Capturing images with a hyperspectral camera has been successfully applied in many areas thanks to its narrow-band imaging technology. Hyperspectral reconstruction from RGB images denotes the reverse process of hyperspectral imaging, achieved by discovering an inverse response function. Current works mainly map RGB images directly to the corresponding spectra but do not consider context information explicitly. Moreover, the use of an encoder-decoder pair in current algorithms leads to loss of information. To address these problems, we propose a 4-level Hierarchical Regression Network (HRNet) with PixelShuffle layers as the inter-level interaction. Furthermore, we adopt a residual dense block to remove artifacts of real-world RGB images and a residual global block to build an attention mechanism for enlarging the perceptive field. We evaluate the proposed HRNet against other architectures and techniques by participating in the NTIRE 2020 Challenge on Spectral Reconstruction from RGB Images. HRNet is the winning method of track 2 (real-world images) and ranks 3rd on track 1 (clean images). Please visit the project web page https://github.com/zhaoyuzhi/Hierarchical-Regression-Network-for-Spectral-Reconstruction-from-RGB-Images to try our code and pre-trained models.

* 1st Place in CVPRW 2020 NTIRE Spectral Reconstruction Challenge

NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results

May 08, 2020
Abdelrahman Abdelhamed, Mahmoud Afifi, Radu Timofte, Michael S. Brown, Yue Cao, Zhilu Zhang, Wangmeng Zuo, Xiaoling Zhang, Jiye Liu, Wendong Chen, Changyuan Wen, Meng Liu, Shuailin Lv, Yunchao Zhang, Zhihong Pan, Baopu Li, Teng Xi, Yanwen Fan, Xiyu Yu, Gang Zhang, Jingtuo Liu, Junyu Han, Errui Ding, Songhyun Yu, Bumjun Park, Jechang Jeong, Shuai Liu, Ziyao Zong, Nan Nan, Chenghua Li, Zengli Yang, Long Bao, Shuangquan Wang, Dongwoon Bai, Jungwon Lee, Youngjung Kim, Kyeongha Rho, Changyeop Shin, Sungho Kim, Pengliang Tang, Yiyun Zhao, Yuqian Zhou, Yuchen Fan, Thomas Huang, Zhihao Li, Nisarg A. Shah, Wei Liu, Qiong Yan, Yuzhi Zhao, Marcin Możejko, Tomasz Latkowski, Lukasz Treszczotko, Michał Szafraniuk, Krzysztof Trojanowski, Yanhong Wu, Pablo Navarrete Michelini, Fengshuo Hu, Yunhua Lu, Sujin Kim, Wonjin Kim, Jaayeon Lee, Jang-Hwan Choi, Magauiya Zhussip, Azamat Khassenov, Jong Hyun Kim, Hwechul Cho, Priya Kansal, Sabari Nathan, Zhangyu Ye, Xiwen Lu, Yaqi Wu, Jiangxin Yang, Yanlong Cao, Siliang Tang, Yanpeng Cao, Matteo Maggioni, Ioannis Marras, Thomas Tanay, Gregory Slabaugh, Youliang Yan, Myungjoo Kang, Han-Soo Choi, Kyungmin Song, Shusong Xu, Xiaomu Lu, Tingniao Wang, Chunxia Lei, Bin Liu, Rajat Gupta, Vineet Kumar

This paper reviews the NTIRE 2020 challenge on real image denoising, with a focus on the newly introduced dataset, the proposed methods, and their results. The challenge is a new version of the previous NTIRE 2019 challenge on real image denoising, which was based on the SIDD benchmark. This challenge is based on newly collected validation and testing image datasets, hence named SIDD+. The challenge has two tracks for quantitatively evaluating image denoising performance in (1) the Bayer-pattern rawRGB and (2) the standard RGB (sRGB) color spaces. Each track had approximately 250 registered participants. A total of 22 teams, proposing 24 methods, competed in the final phase of the challenge. The methods proposed by the participating teams represent the current state-of-the-art performance in image denoising targeting real noisy images. The newly collected SIDD+ datasets are publicly available at: https://bit.ly/siddplus_data.

Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation

May 08, 2020
Zhaohui Zheng, Ping Wang, Dongwei Ren, Wei Liu, Rongguang Ye, Qinghua Hu, Wangmeng Zuo

Deep learning-based object detection and instance segmentation have achieved unprecedented progress. In this paper, we propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS), leading to notable gains in average precision (AP) and average recall (AR) without sacrificing inference efficiency. In particular, we consider three geometric factors, i.e., overlap area, normalized central point distance, and aspect ratio, which are crucial for measuring bounding box regression in object detection and instance segmentation. The three geometric factors are then incorporated into CIoU loss to better distinguish difficult regression cases. Training deep models with CIoU loss results in consistent AP and AR improvements in comparison to the widely adopted $\ell_n$-norm loss and IoU-based loss. Furthermore, we propose Cluster-NMS, where NMS during inference is done by implicitly clustering detected boxes and usually requires fewer iterations. Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR. In the experiments, CIoU loss and Cluster-NMS have been applied to state-of-the-art instance segmentation (e.g., YOLACT) and object detection (e.g., YOLO v3, SSD, and Faster R-CNN) models. Taking YOLACT on MS COCO as an example, our method achieves performance gains of +1.7 AP and +6.2 AR$_{100}$ for object detection, and +0.9 AP and +3.5 AR$_{100}$ for instance segmentation, with 27.1 FPS on one NVIDIA GTX 1080Ti GPU. All the source code and trained models are available at https://github.com/Zzh-tju/CIoU.
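As a rough illustration of how the three geometric factors named in this abstract can be combined into one loss, here is a minimal single-box sketch. The abstract does not give the formula, so the specific terms below (the arctan-based aspect-ratio term `v` and the trade-off weight `alpha`) are an assumption that follows the commonly cited CIoU form; the linked repository's implementation may differ in details. The function name and the `(x1, y1, x2, y2)` box layout are choices made for this example.

```python
import math


def ciou_loss(box_pred, box_gt, eps=1e-9):
    """Sketch of a CIoU-style loss for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Factor 1: overlap area, measured by IoU.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Factor 2: squared center distance, normalized by the squared
    # diagonal of the smallest box enclosing both boxes.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2

    # Factor 3: aspect-ratio consistency (assumed arctan form).
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps))
                              - math.atan(pw / (ph + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / (c2 + eps) + alpha * v
```

For two identical boxes the loss is approximately zero; for disjoint boxes the IoU term saturates at 1 while the center-distance term still varies with how far apart the boxes are, which is what lets the loss separate regression cases that plain IoU loss treats identically.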

* arXiv admin note: text overlap with arXiv:1911.08287
