Xiangyu He

APRIL: Finding the Achilles' Heel on Privacy for Vision Transformers

Dec 28, 2021

Jiahao Lu, Xi Sheryl Zhang, Tianli Zhao, Xiangyu He, Jian Cheng

Abstract: Federated learning frameworks typically require collaborators to share local gradient updates of a common model instead of sharing training data, in order to preserve privacy. However, prior work on gradient leakage attacks has shown that private training data can be revealed from gradients. So far, almost all relevant works base their attacks on fully-connected or convolutional neural networks. Given the rapidly rising trend of adapting Transformers to diverse vision tasks, it is highly valuable to investigate the privacy risk of vision Transformers. In this paper, we analyse the gradient leakage risk of self-attention based mechanisms both theoretically and practically. In particular, we propose APRIL (Attention PRIvacy Leakage), which poses a strong threat to self-attention inspired models such as ViT. By showing how vision Transformers are at risk of privacy leakage via gradients, we stress the importance of designing privacy-safer Transformer models and defense schemes.

Improving Binary Neural Networks through Fully Utilizing Latent Weights

Oct 12, 2021

Weixiang Xu, Qiang Chen, Xiangyu He, Peisong Wang, Jian Cheng

Abstract: Binary Neural Networks (BNNs) rely on a real-valued auxiliary variable W to help binary training. However, pioneering binary works only use W to accumulate gradient updates during backward propagation, which cannot fully exploit its power and may hinder novel advances in BNNs. In this work, we explore the role of W in training beyond acting as a latent variable. Notably, we propose to add W into the computation graph, making it act as a real-valued feature extractor to aid the binary training. We explore different ways of utilizing the real-valued weights and propose a specialized supervision. Visualization experiments qualitatively verify that our approach makes it easier to distinguish between different categories. Quantitative experiments show that our approach outperforms current state-of-the-art methods, further closing the performance gap between floating-point networks and BNNs. Evaluation on ImageNet with ResNet-18 (Top-1 63.4%) and ResNet-34 (Top-1 67.0%) achieves a new state of the art.
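
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of conventional latent-weight training, in which W only accumulates gradient updates and the forward pass sees sign(W). It is not the method proposed in the paper. The straight-through estimator with clipping at |w| <= 1, the learning rate, and the names `sign` and `ste_update` are all assumptions chosen for illustration.

```python
def sign(x: float) -> float:
    """Binarize to {-1, +1}; zero maps to +1 by convention in this sketch."""
    return 1.0 if x >= 0 else -1.0


def ste_update(latent, grads_wrt_binary, lr=0.1):
    """Apply one SGD step to the real-valued latent weights.

    The gradient computed with respect to the binary weights is passed
    straight through to the latent weights (straight-through estimator)
    and zeroed where |w| > 1, which is one common clipping choice.
    """
    updated = []
    for w, g in zip(latent, grads_wrt_binary):
        g_latent = g if abs(w) <= 1.0 else 0.0
        updated.append(w - lr * g_latent)
    return updated


# Hypothetical latent weights and gradients, for illustration only.
latent = [0.05, -0.3, 1.5]
binary = [sign(w) for w in latent]            # weights used in the forward pass
latent = ste_update(latent, [1.0, -1.0, 1.0])
binary_after = [sign(w) for w in latent]      # the first weight flips sign
```

In this sketch the small latent weight 0.05 crosses zero after one update, so its binary value flips, while the weight outside the clipping range is left unchanged.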

Architecture Aware Latency Constrained Sparse Neural Networks

Sep 01, 2021

Tianli Zhao, Qinghao Hu, Xiangyu He, Weixiang Xu, Jiaxing Wang, Cong Leng, Jian Cheng

Abstract: Acceleration of deep neural networks to meet a specific latency constraint is essential for their deployment on mobile devices. In this paper, we design an architecture-aware latency-constrained sparse (ALCS) framework to prune and accelerate CNN models. Taking modern mobile computation architectures into consideration, we propose Single Instruction Multiple Data (SIMD)-structured pruning, along with a novel sparse convolution algorithm for efficient computation. In addition, we propose to estimate the run time of sparse models with piece-wise linear interpolation. The whole latency-constrained pruning task is formulated as a constrained optimization problem that can be efficiently solved with the Alternating Direction Method of Multipliers (ADMM). Extensive experiments show that our system-algorithm co-design framework achieves a much better Pareto frontier between network accuracy and latency on resource-constrained mobile devices.
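
The run-time estimate mentioned in the abstract can be pictured with a small sketch of piece-wise linear interpolation. The abstract does not say which quantity is interpolated over or how the measurements are taken, so the choice of sparsity as the input, the measured points, and the name `estimate_latency` are assumptions made here for illustration.

```python
from bisect import bisect_right


def estimate_latency(sparsity, knots):
    """Estimate latency by linear interpolation between measured points.

    knots is a list of (sparsity, latency_ms) pairs sorted by sparsity.
    Queries outside the measured range are clamped to the nearest endpoint.
    """
    xs = [x for x, _ in knots]
    ys = [y for _, y in knots]
    if sparsity <= xs[0]:
        return ys[0]
    if sparsity >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, sparsity)        # first knot strictly to the right
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    t = (sparsity - x0) / (x1 - x0)       # position within the segment
    return y0 + t * (y1 - y0)


# Hypothetical measurements for one layer (sparsity, latency in ms).
measured = [(0.0, 40.0), (0.5, 24.0), (0.9, 8.0)]
halfway = estimate_latency(0.25, measured)
```

A table of a few measured points per layer is cheap to collect, and interpolating between them avoids running the sparse model on the device for every candidate sparsity level.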

Generative Zero-shot Network Quantization

Jan 21, 2021

Xiangyu He, Qinghao Hu, Peisong Wang, Jian Cheng

Abstract: Convolutional neural networks are able to learn realistic image priors from numerous training samples in low-level image generation and restoration. We show that, for high-level image recognition tasks, we can further reconstruct "realistic" images of each category by leveraging intrinsic Batch Normalization (BN) statistics without any training data. Inspired by the popular VAE/GAN methods, we regard the zero-shot optimization process of synthetic images as generative modeling to match the distribution of BN statistics. The generated images serve as a calibration set for the subsequent zero-shot network quantization. Our method meets the need to quantize models trained on sensitive information, e.g., when no data is available due to privacy concerns. Extensive experiments on benchmark datasets show that, with the help of generated data, our approach consistently outperforms existing data-free quantization methods.

* Technical report

AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results

Sep 15, 2020

Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu(+68 more)

Abstract: This paper reviews the AIM 2020 challenge on efficient single image super-resolution, with a focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor of ×4 based on a set of prior examples of low and corresponding high resolution images. The goal is to devise a network that reduces one or several aspects such as runtime, parameter count, FLOPs, activations, and memory consumption while at least maintaining the PSNR of MSRResNet. The track had 150 registered participants, and 25 teams submitted final results. These results gauge the state of the art in efficient single image super-resolution.
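
The PSNR constraint in the abstract refers to the standard peak signal-to-noise ratio, 10 * log10(MAX^2 / MSE). The sketch below computes it for two flattened pixel lists. The flat-list input, the 8-bit peak value of 255, and the name `psnr` are simplifying assumptions; the challenge's exact evaluation protocol (colour space, border handling) is not described in the abstract and is not reproduced here.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf                   # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: a maximally wrong estimate gives 0 dB.
worst_case = psnr([0, 0, 0, 0], [255, 255, 255, 255])
```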

SpatialFlow: Bridging All Tasks for Panoptic Segmentation

Dec 02, 2019

Qiang Chen, Anda Cheng, Xiangyu He, Peisong Wang, Jian Cheng

Abstract: Object location is fundamental to panoptic segmentation as it is related to all things and stuff. How to integrate object location into both thing and stuff segmentation is a crucial problem. In this paper, we propose object spatial information flows to achieve this objective. More importantly, we design four parallel sub-networks for the sub-tasks in panoptic segmentation, which leads to better adaptation of object spatial information. With the sub-networks, the flows can bridge all tasks together by delivering the object's spatial context from the box regression task to the others. They can also provide clues for segmenting both things and stuff, which helps the network better understand the whole image. Building on the sub-networks and the flows, we present a location-aware and unified framework for panoptic segmentation, denoted as SpatialFlow. We perform a detailed ablation study on each component and conduct extensive experiments to demonstrate the effectiveness of our SpatialFlow. Furthermore, we achieve state-of-the-art results of 47.3 PQ and 62.5 PQ on the MS-COCO and Cityscapes panoptic benchmarks, respectively.

* 13 pages, 6 figures

Location-aware Upsampling for Semantic Segmentation

Nov 14, 2019

Xiangyu He, Zitao Mo, Qiang Chen, Anda Cheng, Peisong Wang, Jian Cheng

Abstract: Many successful learning targets, such as minimizing dice loss and cross-entropy loss, have enabled unprecedented breakthroughs in segmentation tasks. Beyond these semantic metrics, this paper aims to introduce location supervision into semantic segmentation. Based on this idea, we present Location-aware Upsampling (LaU), which adaptively refines the interpolating coordinates with trainable offsets. Then, location-aware losses are established by encouraging pixels to move towards well-classified locations. An LaU is offset prediction coupled with interpolation, which is trained end-to-end to generate a confidence score at each position from coarse to fine. Guided by location-aware losses, the new module can replace its plain counterpart (e.g., bilinear upsampling) in a plug-and-play manner to further boost the leading encoder-decoder approaches. Extensive experiments validate the consistent improvement over the state-of-the-art methods on benchmark datasets. Our code is available at https://github.com/HolmesShuan/Location-aware-Upsampling-for-Semantic-Segmentation

A System-Level Solution for Low-Power Object Detection

Oct 19, 2019

Fanrong Li, Zitao Mo, Peisong Wang, Zejian Liu, Jiayun Zhang, Gang Li, Qinghao Hu, Xiangyu He, Cong Leng, Yang Zhang(+1 more)

Abstract: Object detection has made impressive progress in recent years with the help of deep learning. However, state-of-the-art algorithms are both computation and memory intensive. Though many lightweight networks have been developed for a trade-off between accuracy and efficiency, it is still a challenge to make them practical on an embedded device. In this paper, we present a system-level solution for efficient object detection on a heterogeneous embedded device. The detection network is quantized to low bits, which allows efficient implementation with shift operators. To make the most of the benefits of low-bit quantization, we design a dedicated accelerator with programmable logic. Inside the accelerator, a hybrid dataflow is exploited according to the heterogeneous properties of different convolutional layers. We adopt a straightforward but resource-friendly column-prior tiling strategy to map the computation-intensive convolutional layers to the accelerator, which can support arbitrary feature sizes. Other operations can be performed on the low-power CPU cores, and the entire system is executed in a pipelined manner. As a case study, we evaluate our object detection system on a real-world surveillance video with an input size of 512x512; the system achieves an inference speed of 18 fps at a cost of 6.9 W (with display), with an mAP of 66.4 verified on the PASCAL VOC 2012 dataset.
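
One common way low-bit weights allow multiplications to be replaced by shifts is to restrict each weight to a signed power of two. The abstract does not specify the quantization scheme actually used, so the sketch below is only an assumed illustration of the general idea; the exponent range and the names `quantize_pow2` and `shift_multiply` are hypothetical.

```python
import math


def quantize_pow2(w, min_exp=-7, max_exp=0):
    """Approximate w by sign * 2**exp and return (sign, exp).

    The exponent is rounded in the log domain and clamped to
    [min_exp, max_exp]; a zero weight is returned with sign 0.
    """
    if w == 0:
        return 0, min_exp
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))
    return (1 if w > 0 else -1), exp


def shift_multiply(activation: int, sign: int, exp: int) -> int:
    """Multiply an integer activation by sign * 2**exp using shifts only.

    A negative exponent becomes an arithmetic right shift, which rounds
    toward negative infinity for negative activations.
    """
    shifted = activation << exp if exp >= 0 else activation >> -exp
    return sign * shifted


# Hypothetical weight and activation, for illustration only.
sign_bit, exponent = quantize_pow2(0.26)            # 0.26 is close to 2**-2
product = shift_multiply(64, sign_bit, exponent)    # 64 * 0.25 via a shift
```

Storing only a sign and a small exponent per weight keeps the bit width low, and a shifter is far cheaper in programmable logic than a full multiplier.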

* Accepted by ICCV 2019 Low-Power Computer Vision Workshop

Compact Global Descriptor for Neural Networks

Aug 01, 2019

Xiangyu He, Ke Cheng, Qiang Chen, Qinghao Hu, Peisong Wang, Jian Cheng

Abstract: Modeling long-range dependencies, widely used to capture spatiotemporal correlations, has been shown to be effective in CNN-dominated computer vision tasks. Yet neither stacks of convolutional operations that enlarge receptive fields nor recent nonlocal modules are computationally efficient. In this paper, we present a generic family of lightweight global descriptors for modeling the interactions between positions across different dimensions (e.g., channels, frames). This descriptor enables subsequent convolutions to access the informative global features with negligible computational complexity and parameters. Benchmark experiments show that the proposed method can compete with state-of-the-art long-range mechanisms with a significant reduction in extra computing cost. Code is available at https://github.com/HolmesShuan/Compact-Global-Descriptor.
