Zhenxing Niu

Towards Simple and Accurate Human Pose Estimation with Stair Network

Feb 18, 2022

Chenru Jiang, Kaizhu Huang, Shufei Zhang, Jimin Xiao, Zhenxing Niu, Amir Hussain

Abstract: In this paper, we focus on the task of precise keypoint coordinate regression. Most existing approaches adopt complicated networks with a large number of parameters, leading to heavy models with poor cost-effectiveness in practice. To overcome this limitation, we develop a small yet discriminative model called STair Network, which can be simply stacked into an accurate multi-stage pose estimation system. Specifically, to reduce computational cost, STair Network is composed of novel basic feature extraction blocks that promote feature diversity and obtain rich local representations with fewer parameters, enabling a satisfactory balance between efficiency and performance. To further improve performance, we introduce two mechanisms with negligible computational cost, focusing on feature fusion and replenishment. We demonstrate the effectiveness of the STair Network on two standard datasets; e.g., the 1-stage STair Network achieves 5.5% higher accuracy than HRNet on the COCO test dataset with 80% fewer parameters and 68% fewer GFLOPs.

Adversarial Fine-tuning for Backdoor Defense: Connect Adversarial Examples to Triggered Samples

Feb 13, 2022

Bingxu Mu, Le Wang, Zhenxing Niu

Abstract: Deep neural networks (DNNs) are known to be vulnerable to backdoor attacks: once a backdoor trigger is planted at training time, the infected DNN model misclassifies any test sample embedded with the trigger as the target label. Due to the stealthiness of backdoor attacks, it is hard to either detect or erase the backdoor from infected models. In this paper, we propose a new Adversarial Fine-Tuning (AFT) approach to erase backdoor triggers by leveraging adversarial examples of the infected model. For an infected model, we observe that its adversarial examples behave similarly to its triggered samples. Based on this observation, we design AFT to break the foundation of the backdoor attack (i.e., the strong correlation between a trigger and a target label). We empirically show that, against 5 state-of-the-art backdoor attacks, AFT can effectively erase the backdoor triggers without obvious performance degradation on clean samples, significantly outperforming existing defense methods.

Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction

Aug 16, 2021

Fang Zheng, Le Wang, Sanping Zhou, Wei Tang, Zhenxing Niu, Nanning Zheng, Gang Hua

Abstract: Understanding complex social interactions among agents is a key challenge for trajectory prediction. Most existing methods consider the interactions between pairwise traffic agents or in a local area, while the nature of interactions is unlimited, involving an uncertain number of agents and non-local areas simultaneously. Besides, they treat interactions among heterogeneous traffic agents, namely those among agents of different categories, the same, neglecting people's diverse reaction patterns toward traffic agents in different categories. To address these problems, we propose a simple yet effective Unlimited Neighborhood Interaction Network (UNIN), which predicts trajectories of heterogeneous agents in multiple categories. Specifically, the proposed unlimited neighborhood interaction module generates the fused features of all agents involved in an interaction simultaneously, which is adaptive to any number of agents and any range of interaction area. Meanwhile, a hierarchical graph attention module is proposed to obtain category-to-category interaction and agent-to-agent interaction. Finally, parameters of a Gaussian Mixture Model are estimated for generating the future trajectories. Extensive experimental results on benchmark datasets demonstrate a significant performance improvement of our method over the state-of-the-art methods.
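
The last step of the abstract above, generating a future position from estimated Gaussian Mixture Model parameters, can be illustrated with a minimal sketch. It assumes a 2D mixture with diagonal covariance; the function name `sample_gmm_2d` and all parameter values are hypothetical, and the abstract does not say how UNIN parameterizes or samples its mixture.

```python
import random

def sample_gmm_2d(weights, means, stds, rng):
    """Draw one (x, y) point from a 2D Gaussian mixture.

    Assumption: each component has a diagonal covariance, so x and y
    are sampled independently once a component has been chosen.
    """
    # Pick a component index with probability proportional to its weight.
    (k,) = rng.choices(range(len(weights)), weights=weights, k=1)
    mean_x, mean_y = means[k]
    std_x, std_y = stds[k]
    return (rng.gauss(mean_x, std_x), rng.gauss(mean_y, std_y))

# Hypothetical mixture: two modes, e.g. "keep walking" and "turn aside".
rng = random.Random(0)
weights = [0.7, 0.3]
means = [(1.0, 0.0), (0.5, 0.8)]
stds = [(0.1, 0.1), (0.2, 0.2)]
samples = [sample_gmm_2d(weights, means, stds, rng) for _ in range(5)]
```

Drawing several samples per step is one common way to obtain multiple plausible futures from such a model.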

* ICCV 2021

Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters

Aug 03, 2021

Bowen Dong, Zitong Huang, Yuelin Guo, Qilong Wang, Zhenxing Niu, Wangmeng Zuo

Abstract: Weakly-supervised object detection (WSOD) has emerged as an inspiring recent topic to avoid expensive instance-level object annotations. However, the bounding boxes of most existing WSOD methods are mainly determined by precomputed proposals, and are thereby limited in precise object localization. In this paper, we defend the problem setting for improving localization performance by leveraging the bounding box regression knowledge from a well-annotated auxiliary dataset. First, we use the well-annotated auxiliary dataset to explore a series of learnable bounding box adjusters (LBBAs) in a multi-stage training manner, which is class-agnostic. Then, only LBBAs and a weakly-annotated dataset with non-overlapped classes are used for training LBBA-boosted WSOD. As such, our LBBAs are practically more convenient and economical to implement while avoiding the leakage of the auxiliary well-annotated dataset. In particular, we formulate learning bounding box adjusters as a bi-level optimization problem and suggest an EM-like multi-stage training algorithm. Then, a multi-stage scheme is further presented for LBBA-boosted WSOD. Additionally, a masking strategy is adopted to improve proposal classification. Experimental results verify the effectiveness of our method. Our method performs favorably against state-of-the-art WSOD methods and a knowledge transfer model with a similar problem setting. Code is publicly available at https://github.com/DongSky/lbba_boosted_wsod.

* ICCV 2021 (poster)

Structure First Detail Next: Image Inpainting with Pyramid Generator

Jun 16, 2021

Shuyi Qu, Zhenxing Niu, Kaizhu Huang, Jianke Zhu, Matan Protter, Gadi Zimerman, Yinghui Xu

Abstract: Recent deep generative models have achieved promising performance in image inpainting. However, it is still very challenging for a neural network to generate realistic image details and textures, due to its inherent spectral bias. Drawing on our understanding of how artists work, we suggest adopting a 'structure first, detail next' workflow for image inpainting. To this end, we propose to build a Pyramid Generator by stacking several sub-generators, where lower-layer sub-generators focus on restoring image structures while the higher-layer sub-generators emphasize image details. Given an input image, it is gradually restored by going through the entire pyramid in a bottom-up fashion. In particular, our approach has a learning scheme of progressively increasing hole size, which allows it to restore large-hole images. In addition, our method can fully exploit the benefits of learning with high-resolution images, and hence is suitable for high-resolution image inpainting. Extensive experimental results on benchmark datasets have validated the effectiveness of our approach compared with state-of-the-art methods.

* ICCV'21 under review

Adversarial Attack and Defense in Deep Ranking

Jun 07, 2021

Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Nanning Zheng, Gang Hua

Abstract: Deep Neural Network classifiers are vulnerable to adversarial attack, where an imperceptible perturbation could result in misclassification. However, the vulnerability of DNN-based image ranking systems remains under-explored. In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations. Specifically, the expected ranking order is first represented as a set of inequalities, and then a triplet-like objective function is designed to obtain the optimal perturbation. Conversely, an anti-collapse triplet defense is proposed to improve the ranking model's robustness against all proposed attacks, where the model learns to prevent the positive and negative samples from being pulled close to each other by adversarial attack. To comprehensively measure the empirical adversarial robustness of a ranking model with our defense, we propose an empirical robustness score, which involves a set of representative attacks against ranking models. Our adversarial ranking attacks and defenses are evaluated on the MNIST, Fashion-MNIST, CUB200-2011, CARS196 and Stanford Online Products datasets. Experimental results demonstrate that a typical deep ranking system can be effectively compromised by our attacks. Nevertheless, our defense can significantly improve the ranking system's robustness, and simultaneously mitigate a wide range of attacks.
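
The idea of turning an expected ranking order into inequalities and then into a triplet-like objective can be illustrated with a small sketch. It assumes a single query, Euclidean distance in embedding space, and a zero margin; the function name `rank_raising_loss` is hypothetical, and the exact objective used in the paper may differ.

```python
import math

def rank_raising_loss(query, candidate, others):
    """Hinge relaxation of the inequalities d(q, c) < d(q, x) for every other x.

    Each violated inequality contributes its violation amount; the loss is
    zero exactly when the chosen candidate is at least as close to the query
    as every other candidate, i.e. when it is ranked first.
    """
    d_qc = math.dist(query, candidate)
    return sum(max(0.0, d_qc - math.dist(query, x)) for x in others)

# Toy 2D embeddings (hypothetical values).
query = (0.0, 0.0)
others = [(1.0, 0.0), (5.0, 0.0)]
loss_far = rank_raising_loss(query, (3.0, 0.0), others)   # one inequality violated
loss_near = rank_raising_loss(query, (0.5, 0.0), others)  # no inequality violated
```

An attacker would minimize such a loss over a bounded perturbation of the candidate image; that optimization loop is omitted here.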

Video Imprint

Jun 07, 2021

Zhanning Gao, Le Wang, Nebojsa Jojic, Zhenxing Niu, Nanning Zheng, Gang Hua

Abstract: A new unified video analytics framework (ER3) is proposed for complex event retrieval, recognition and recounting, based on the proposed video imprint representation, which exploits temporal correlations among image features across video frames. With the video imprint representation, it is convenient to reverse map back to both temporal and spatial locations in video frames, allowing for both key frame identification and key area localization within each frame. In the proposed framework, a dedicated feature alignment module is incorporated for redundancy removal across frames to produce the tensor representation, i.e., the video imprint. Subsequently, the video imprint is individually fed into both a reasoning network and a feature aggregation module, for event recognition/recounting and event retrieval tasks, respectively. Thanks to its attention mechanism inspired by the memory networks used in language modeling, the proposed reasoning network is capable of simultaneous event category recognition and localization of the key pieces of evidence for event recounting. In addition, the latent structure in our reasoning network highlights the areas of the video imprint, which can be directly used for event recounting. For the event retrieval task, the compact video representation aggregated from the video imprint contributes to better retrieval results than existing state-of-the-art methods.

* IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(12), 3086-3099 (2018)

Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps

Apr 25, 2021

Dongsheng Wang, Chaohao Xie, Shaohui Liu, Zhenxing Niu, Wangmeng Zuo

Abstract: For image inpainting, the convolutional neural networks (CNNs) in previous methods often adopt the standard convolutional operator, which treats valid pixels and holes indistinguishably. As a result, they are limited in handling irregular holes and tend to produce color-discrepant and blurry inpainting results. Partial convolution (PConv) copes with this issue by conducting masked convolution and feature re-normalization conditioned only on valid pixels, but the mask-updating is handcrafted and independent of image structural information. In this paper, we present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes with several distinct merits. Instead of using a hard 0-1 mask, a learnable attention map module is introduced for learning feature re-normalization and mask-updating in an end-to-end manner. Learnable reverse attention maps are further proposed in the decoder to emphasize filling in unknown pixels instead of reconstructing all pixels. Motivated by the fact that the filling-in order is crucial to inpainting results and largely depends on image structures in exemplar-based methods, we further suggest a multi-scale edge completion network to predict coherent edges. Our Edge-LBAM method contains dual procedures, including structure-aware mask-updating guided by predicted edges and attention maps generated by masks for feature re-normalization. Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness, and performs favorably against the state-of-the-art methods in terms of qualitative metrics and visual quality.
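
To make the PConv baseline mentioned above concrete, here is a minimal single-channel sketch of masked convolution with re-normalization over valid pixels and a hard mask update. It illustrates the handcrafted scheme that the abstract contrasts against, not Edge-LBAM itself; the function name `partial_conv2d` is hypothetical, and the scaling factor (window size divided by the number of valid pixels) is assumed from the commonly used PConv formulation.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """Single-channel partial convolution without padding.

    image, mask, kernel are lists of lists; mask holds 1 for valid pixels
    and 0 for holes. Returns (output, updated_mask).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    window_size = kh * kw
    out = [[0.0] * out_w for _ in range(out_h)]
    new_mask = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            valid = 0
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    m = mask[i + di][j + dj]
                    valid += m
                    # Hole pixels are zeroed out before the weighted sum.
                    acc += kernel[di][dj] * image[i + di][j + dj] * m
            if valid > 0:
                # Re-normalize by the fraction of valid pixels in the window.
                out[i][j] = acc * (window_size / valid) + bias
                # Hard 0-1 rule: any valid input pixel makes the output valid.
                new_mask[i][j] = 1
    return out, new_mask

# A constant image with a hole in the center; an averaging kernel should
# recover the constant value once the hole is excluded and rescaled.
image = [[2.0] * 3 for _ in range(3)]
mask = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
out, new_mask = partial_conv2d(image, mask, kernel)
```

The fixed rule in the mask update is the part the abstract describes as handcrafted; the proposed method replaces it with learnable attention maps.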

* 16 pages, 13 figures

SGCN:Sparse Graph Convolution Network for Pedestrian Trajectory Prediction

Apr 04, 2021

Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua

Abstract: Pedestrian trajectory prediction is a key technology in autopilot, and it remains very challenging due to complex interactions between pedestrians. However, previous works based on dense undirected interaction suffer from modeling superfluous interactions and neglecting the trajectory motion tendency, and thus inevitably deviate considerably from reality. To cope with these issues, we present a Sparse Graph Convolution Network (SGCN) for pedestrian trajectory prediction. Specifically, the SGCN explicitly models the sparse directed interaction with a sparse directed spatial graph to capture adaptive interactions between pedestrians. Meanwhile, we use a sparse directed temporal graph to model the motion tendency, thereby facilitating prediction based on the observed direction. Finally, parameters of a bi-Gaussian distribution for trajectory prediction are estimated by fusing the above two sparse graphs. We evaluate our proposed method on the ETH and UCY datasets, and the experimental results show that our method outperforms comparable state-of-the-art methods by 9% in Average Displacement Error (ADE) and 13% in Final Displacement Error (FDE). Notably, visualizations indicate that our method can capture adaptive interactions between pedestrians and their effective motion tendencies.
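
The two metrics named above are usually defined as follows: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final step. The sketch below computes both for a single trajectory; the function name `displacement_errors` is hypothetical, and the paper's full evaluation protocol (for example, how multiple sampled predictions are scored) is not described in the abstract.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as a list of (x, y) points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    # Per-step Euclidean distance between prediction and ground truth.
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # average over all time steps
    fde = dists[-1]                # error at the last time step only
    return ade, fde

# Toy example: the prediction goes straight while the pedestrian drifts upward.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

In this toy case the per-step errors are 0, 1, and 2, so ADE is 1.0 and FDE is 2.0.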

* Accepted by CVPR 2021

Practical Relative Order Attack in Deep Ranking

Mar 20, 2021

Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Yinghui Xu, Nanning Zheng, Gang Hua

Abstract: Recent studies unveil the vulnerabilities of deep ranking models, where an imperceptible perturbation can trigger dramatic changes in the ranking result. While previous attempts focus on manipulating the absolute ranks of certain candidates, the possibility of adjusting their relative order remains under-explored. In this paper, we formulate a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order among a selected set of candidates according to an attacker-specified permutation, with limited interference to other unrelated candidates. Specifically, it is formulated as a triplet-style loss imposing an inequality chain reflecting the specified permutation. However, direct optimization of such a white-box objective is infeasible in a real-world attack scenario due to various black-box limitations. To cope with them, we propose a Short-range Ranking Correlation metric as a surrogate objective for the black-box Order Attack to approximate the white-box method. The Order Attack is evaluated on the Fashion-MNIST and Stanford-Online-Products datasets under both white-box and black-box threat models. The black-box attack is also successfully implemented on a major e-commerce platform. Comprehensive experimental evaluations demonstrate the effectiveness of the proposed methods, revealing a new type of ranking model vulnerability.
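
One way to read the "triplet-style loss imposing an inequality chain" is sketched below. Given an attacker-specified permutation, the desired chain is d(q, c_p1) < d(q, c_p2) < ... < d(q, c_pk), and each ordered pair that violates it is penalized with a hinge term. The function name `order_chain_loss`, the use of Euclidean distance, and the zero margin are assumptions made for illustration; the paper's exact loss may differ, and the Short-range Ranking Correlation surrogate is not reproduced here.

```python
import math

def order_chain_loss(query, candidates, permutation):
    """Sum of hinge penalties over all pairs that break the desired order.

    permutation lists candidate indices from the one that should rank
    first (closest to the query) to the one that should rank last.
    """
    dists = [math.dist(query, candidates[k]) for k in permutation]
    loss = 0.0
    for i in range(len(dists)):
        for j in range(i + 1, len(dists)):
            # Position i should be closer to the query than position j.
            loss += max(0.0, dists[i] - dists[j])
    return loss

# Toy 2D embeddings (hypothetical values), already ordered by distance.
query = (0.0, 0.0)
candidates = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
loss_same_order = order_chain_loss(query, candidates, [0, 1, 2])
loss_new_order = order_chain_loss(query, candidates, [2, 0, 1])
```

The loss is zero when the current ranking already matches the permutation and grows with the amount of reordering still required.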
