Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Bowen Li

Universal Learned Image Compression With Low Computational Cost

Jun 23, 2022
Bowen Li, Yao Xin, Youneng Bao, Fanyang Meng, Yongsheng Liang, Wen Tan

Figure 1 for Universal Learned Image Compression With Low Computational Cost

Figure 2 for Universal Learned Image Compression With Low Computational Cost

Figure 3 for Universal Learned Image Compression With Low Computational Cost

Figure 4 for Universal Learned Image Compression With Low Computational Cost

Recently, learned image compression methods have developed rapidly and exhibited excellent rate-distortion performance when compared to traditional standards, such as JPEG, JPEG2000 and BPG. However, the learning-based methods suffer from high computational costs, which is not beneficial for deployment on devices with limited resources. To this end, we propose shift-addition parallel modules (SAPMs), including SAPM-E for the encoder and SAPM-D for the decoder, to largely reduce the energy consumption. To be specific, they can be taken as plug-and-play components to upgrade existing CNN-based architectures, where the shift branch is used to extract large-grained features as compared to small-grained features learned by the addition branch. Furthermore, we thoroughly analyze the probability distribution of latent representations and propose to use Laplace Mixture Likelihoods for more accurate entropy estimation. Experimental results demonstrate that the proposed methods can achieve comparable or even better performance on both PSNR and MS-SSIM metrics to that of the convolutional counterpart with an about 2x energy reduction.

* 5 pages

Via

Access Paper or Ask Questions

Siamese Object Tracking for Unmanned Aerial Vehicle: A Review and Comprehensive Analysis

May 09, 2022
Changhong Fu, Kunhan Lu, Guangze Zheng, Junjie Ye, Ziang Cao, Bowen Li

Figure 1 for Siamese Object Tracking for Unmanned Aerial Vehicle: A Review and Comprehensive Analysis

Figure 2 for Siamese Object Tracking for Unmanned Aerial Vehicle: A Review and Comprehensive Analysis

Figure 3 for Siamese Object Tracking for Unmanned Aerial Vehicle: A Review and Comprehensive Analysis

Figure 4 for Siamese Object Tracking for Unmanned Aerial Vehicle: A Review and Comprehensive Analysis

Unmanned aerial vehicle (UAV)-based visual object tracking has enabled a wide range of applications and attracted increasing attention in the field of remote sensing because of its versatility and effectiveness. As a new force in the revolutionary trend of deep learning, Siamese networks shine in visual object tracking with their promising balance of accuracy, robustness, and speed. Thanks to the development of embedded processors and the gradual optimization of deep neural networks, Siamese trackers receive extensive research and realize preliminary combinations with UAVs. However, due to the UAV's limited onboard computational resources and the complex real-world circumstances, aerial tracking with Siamese networks still faces severe obstacles in many aspects. To further explore the deployment of Siamese networks in UAV tracking, this work presents a comprehensive review of leading-edge Siamese trackers, along with an exhaustive UAV-specific analysis based on the evaluation using a typical UAV onboard processor. Then, the onboard tests are conducted to validate the feasibility and efficacy of representative Siamese trackers in real-world UAV deployment. Furthermore, to better promote the development of the tracking community, this work analyzes the limitations of existing Siamese trackers and conducts additional experiments represented by low-illumination evaluations. In the end, prospects for the development of Siamese UAV tracking in the remote sensing field are discussed. The unified framework of leading-edge Siamese trackers, i.e., code library, and the results of their experimental evaluations are available at https://github.com/vision4robotics/SiameseTracking4UAV .

Via

Access Paper or Ask Questions

S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

Mar 14, 2022
Binyuan Hui, Ruiying Geng, Lihan Wang, Bowen Qin, Bowen Li, Jian Sun, Yongbin Li

Figure 1 for S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

Figure 2 for S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

Figure 3 for S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

Figure 4 for S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

The task of converting a natural language question into an executable SQL query, known as text-to-SQL, is an important branch of semantic parsing. The state-of-the-art graph-based encoder has been successfully used in this task but does not model the question syntax well. In this paper, we propose S$^2$SQL, injecting Syntax to question-Schema graph encoder for Text-to-SQL parsers, which effectively leverages the syntactic dependency information of questions in text-to-SQL to improve the performance. We also employ the decoupling constraint to induce diverse relational edge embedding, which further improves the network's performance. Experiments on the Spider and robustness setting Spider-Syn demonstrate that the proposed approach outperforms all existing methods when pre-training models are used, resulting in a performance ranks first on the Spider leaderboard.

* Accepted at ACL 2022 Findings

Via

Access Paper or Ask Questions

AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration

Dec 03, 2021
Bowen Li, Chen Wang, Pranay Reddy, Seungchan Kim, Sebastian Scherer

Figure 1 for AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration

Figure 2 for AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration

Figure 3 for AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration

Figure 4 for AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration

Few-shot object detection has rapidly progressed owing to the success of meta-learning strategies. However, the requirement of a fine-tuning stage in existing methods is timeconsuming and significantly hinders their usage in real-time applications such as autonomous exploration of low-power robots. To solve this problem, we present a brand new architecture, AirDet, which is free of fine-tuning by learning class agnostic relation with support images. Specifically, we propose a support-guided cross-scale (SCS) feature fusion network to generate object proposals, a global-local relation network (GLR) for shots aggregation, and a relation-based prototype embedding network (R-PEN) for precise localization. Exhaustive experiments are conducted on COCO and PASCAL VOC datasets, where surprisingly, AirDet achieves comparable or even better results than the exhaustively finetuned methods, reaching up to 40-60% improvements on the baseline. To our excitement, AirDet obtains favorable performance on multi-scale objects, especially the small ones. Furthermore, we present evaluation results on real-world exploration tests from the DARPA Subterranean Challenge, which strongly validate the feasibility of AirDet in robotics. The source code, pre-trained models, along with the real world data for exploration, will be made public.

* 10 pages, 8 figures

Via

Access Paper or Ask Questions

Accurate and Generalizable Quantitative Scoring of Liver Steatosis from Ultrasound Images via Scalable Deep Learning

Oct 12, 2021
Bowen Li, Dar-In Tai, Ke Yan, Yi-Cheng Chen, Shiu-Feng Huang, Tse-Hwa Hsu, Wan-Ting Yu, Jing Xiao, Le Lu, Adam P. Harrison

Figure 1 for Accurate and Generalizable Quantitative Scoring of Liver Steatosis from Ultrasound Images via Scalable Deep Learning

Figure 2 for Accurate and Generalizable Quantitative Scoring of Liver Steatosis from Ultrasound Images via Scalable Deep Learning

Figure 3 for Accurate and Generalizable Quantitative Scoring of Liver Steatosis from Ultrasound Images via Scalable Deep Learning

Figure 4 for Accurate and Generalizable Quantitative Scoring of Liver Steatosis from Ultrasound Images via Scalable Deep Learning

Background & Aims: Hepatic steatosis is a major cause of chronic liver disease. 2D ultrasound is the most widely used non-invasive tool for screening and monitoring, but associated diagnoses are highly subjective. We developed a scalable deep learning (DL) algorithm for quantitative scoring of liver steatosis from 2D ultrasound images. Approach & Results: Using retrospectively collected multi-view ultrasound data from 3,310 patients, 19,513 studies, and 228,075 images, we trained a DL algorithm to diagnose steatosis stages (healthy, mild, moderate, or severe) from ultrasound diagnoses. Performance was validated on two multi-scanner unblinded and blinded (initially to DL developer) histology-proven cohorts (147 and 112 patients) with histopathology fatty cell percentage diagnoses, and a subset with FibroScan diagnoses. We also quantified reliability across scanners and viewpoints. Results were evaluated using Bland-Altman and receiver operating characteristic (ROC) analysis. The DL algorithm demonstrates repeatable measurements with a moderate number of images (3 for each viewpoint) and high agreement across 3 premium ultrasound scanners. High diagnostic performance was observed across all viewpoints: area under the curves of the ROC to classify >=mild, >=moderate, =severe steatosis grades were 0.85, 0.90, and 0.93, respectively. The DL algorithm outperformed or performed at least comparably to FibroScan with statistically significant improvements for all levels on the unblinded histology-proven cohort, and for =severe steatosis on the blinded histology-proven cohort. Conclusions: The DL algorithm provides a reliable quantitative steatosis assessment across view and scanners on two multi-scanner cohorts. Diagnostic performance was high with comparable or better performance than FibroScan.

* Journal paper submission, 45 pages (main body: 28 pages, supplementary material: 17 pages)

Via

Access Paper or Ask Questions

FedIPR: Ownership Verification for Federated Deep Neural Network Models

Sep 27, 2021
Lixin Fan, Bowen Li, Hanlin Gu, Jie Li, Qiang Yang

Figure 1 for FedIPR: Ownership Verification for Federated Deep Neural Network Models

Figure 2 for FedIPR: Ownership Verification for Federated Deep Neural Network Models

Figure 3 for FedIPR: Ownership Verification for Federated Deep Neural Network Models

Figure 4 for FedIPR: Ownership Verification for Federated Deep Neural Network Models

Federated learning models must be protected against plagiarism since these models are built upon valuable training data owned by multiple institutions or people.This paper illustrates a novel federated deep neural network (FedDNN) ownership verification scheme that allows ownership signatures to be embedded and verified to claim legitimate intellectual property rights (IPR) of FedDNN models, in case that models are illegally copied, re-distributed or misused. The effectiveness of embedded ownership signatures is theoretically justified by proved condition sunder which signatures can be embedded and detected by multiple clients with-out disclosing private signatures. Extensive experimental results on CIFAR10,CIFAR100 image datasets demonstrate that varying bit-lengths signatures can be embedded and reliably detected without affecting models classification performances. Signatures are also robust against removal attacks including fine-tuning and pruning.

* 23 pages, 12 figures

Via

Access Paper or Ask Questions

Federated Deep Learning with Bayesian Privacy

Sep 27, 2021
Hanlin Gu, Lixin Fan, Bowen Li, Yan Kang, Yuan Yao, Qiang Yang

Figure 1 for Federated Deep Learning with Bayesian Privacy

Figure 2 for Federated Deep Learning with Bayesian Privacy

Figure 3 for Federated Deep Learning with Bayesian Privacy

Figure 4 for Federated Deep Learning with Bayesian Privacy

Federated learning (FL) aims to protect data privacy by cooperatively learning a model without sharing private data among users. For Federated Learning of Deep Neural Network with billions of model parameters, existing privacy-preserving solutions are unsatisfactory. Homomorphic encryption (HE) based methods provide secure privacy protections but suffer from extremely high computational and communication overheads rendering it almost useless in practice . Deep learning with Differential Privacy (DP) was implemented as a practical learning algorithm at a manageable cost in complexity. However, DP is vulnerable to aggressive Bayesian restoration attacks as disclosed in the literature and demonstrated in experimental results of this work. To address the aforementioned perplexity, we propose a novel Bayesian Privacy (BP) framework which enables Bayesian restoration attacks to be formulated as the probability of reconstructing private data from observed public information. Specifically, the proposed BP framework accurately quantifies privacy loss by Kullback-Leibler (KL) Divergence between the prior distribution about the privacy data and the posterior distribution of restoration private data conditioning on exposed information}. To our best knowledge, this Bayesian Privacy analysis is the first to provides theoretical justification of secure privacy-preserving capabilities against Bayesian restoration attacks. As a concrete use case, we demonstrate that a novel federated deep learning method using private passport layers is able to simultaneously achieve high model performance, privacy-preserving capability and low computational complexity. Theoretical analysis is in accordance with empirical measurements of information leakage extensively experimented with a variety of DNN networks on image classification MNIST, CIFAR10, and CIFAR100 datasets.

Via

Access Paper or Ask Questions

Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Aug 19, 2021
Bowen Li, Weixia Zhang, Meng Tian, Guangtao Zhai, Xianpei Wang

Figure 1 for Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Figure 2 for Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Figure 3 for Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Figure 4 for Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Perceptual quality assessment of the videos acquired in the wilds is of vital importance for quality assurance of video services. The inaccessibility of reference videos with pristine quality and the complexity of authentic distortions pose great challenges for this kind of blind video quality assessment (BVQA) task. Although model-based transfer learning is an effective and efficient paradigm for the BVQA task, it remains to be a challenge to explore what and how to bridge the domain shifts for better video representation. In this work, we propose to transfer knowledge from image quality assessment (IQA) databases with authentic distortions and large-scale action recognition with rich motion patterns. We rely on both groups of data to learn the feature extractor. We train the proposed model on the target VQA databases using a mixed list-wise ranking loss function. Extensive experiments on six databases demonstrate that our method performs very competitively under both individual database and mixed database training settings. We also verify the rationality of each component of the proposed method and explore a simple manner for further improvement.

Via

Access Paper or Ask Questions

HiFT: Hierarchical Feature Transformer for Aerial Tracking

Aug 15, 2021
Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, Yiming Li

Figure 1 for HiFT: Hierarchical Feature Transformer for Aerial Tracking

Figure 2 for HiFT: Hierarchical Feature Transformer for Aerial Tracking

Figure 3 for HiFT: Hierarchical Feature Transformer for Aerial Tracking

Figure 4 for HiFT: Hierarchical Feature Transformer for Aerial Tracking

Most existing Siamese-based tracking methods execute the classification and regression of the target object based on the similarity maps. However, they either employ a single map from the last convolutional layer which degrades the localization accuracy in complex scenarios or separately use multiple maps for decision making, introducing intractable computations for aerial mobile platforms. Thus, in this work, we propose an efficient and effective hierarchical feature transformer (HiFT) for aerial tracking. Hierarchical similarity maps generated by multi-level convolutional layers are fed into the feature transformer to achieve the interactive fusion of spatial (shallow layers) and semantics cues (deep layers). Consequently, not only the global contextual information can be raised, facilitating the target search, but also our end-to-end architecture with the transformer can efficiently learn the interdependencies among multi-level features, thereby discovering a tracking-tailored feature space with strong discriminability. Comprehensive evaluations on four aerial benchmarks have proven the effectiveness of HiFT. Real-world tests on the aerial platform have strongly validated its practicability with a real-time speed. Our code is available at https://github.com/vision4robotics/HiFT.

* 2021 IEEE International Conference on Computer Vision (ICCV)

Via

Access Paper or Ask Questions