Di Xie

Single Domain Dynamic Generalization for Iris Presentation Attack Detection

May 22, 2023

Yachun Li, Jingjing Wang, Yuhui Chen, Di Xie, Shiliang Pu

Abstract: Iris presentation attack detection (PAD) has achieved great success under intra-domain settings but easily degrades on unseen domains. Conventional domain generalization methods mitigate the gap by learning domain-invariant features. However, they ignore the discriminative information in the domain-specific features. Moreover, we usually face a more realistic scenario with only one single domain available for training. To tackle the above issues, we propose a Single Domain Dynamic Generalization (SDDG) framework, which simultaneously exploits domain-invariant and domain-specific features on a per-sample basis and learns to generalize to various unseen domains with numerous natural images. Specifically, a dynamic block is designed to adaptively adjust the network with a dynamic adaptor, and an information maximization loss is further incorporated to increase diversity. The whole network is integrated into the meta-learning paradigm. We generate amplitude perturbed images and cover diverse domains with natural images. Therefore, the network can learn to generalize to the perturbed domains in the meta-test phase. Extensive experiments show the proposed method is effective and outperforms the state-of-the-art on the LivDet-Iris 2017 dataset.

* ICASSP 2023 Camera Ready
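
The "amplitude perturbed images" mentioned in the abstract can be illustrated with a small sketch. It assumes the perturbation acts on the Fourier amplitude spectrum, interpolating toward the amplitude of an auxiliary (natural) signal while keeping the original phase. That reading, the 1-D toy signals, the linear mixing rule, and the names `amplitude_mix` and `lam` are assumptions made for illustration and are not confirmed by the abstract.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def amplitude_mix(signal, aux, lam):
    """Blend the amplitude spectrum of `signal` toward that of `aux`.

    lam = 0 keeps the original amplitude, lam = 1 uses the auxiliary amplitude.
    The phase of `signal` is always kept (assumed design, see lead-in).
    """
    mixed = []
    for s, a in zip(dft(signal), dft(aux)):
        amplitude = (1.0 - lam) * abs(s) + lam * abs(a)
        mixed.append(cmath.rect(amplitude, cmath.phase(s)))
    # For real inputs the mixed spectrum stays conjugate-symmetric,
    # so the imaginary parts are only numerical noise.
    return [value.real for value in idft(mixed)]

original = [1.0, 2.0, 0.5, -1.0]
natural = [0.0, 1.0, 0.0, -1.0]
perturbed = amplitude_mix(original, natural, lam=0.5)
```

In a 2-D setting the same idea would be applied to the image spectrum, with `lam` sampled at random per image.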

Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation

Mar 30, 2023

Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, Shiliang Pu

Abstract: Most existing approaches for point cloud normal estimation aim to locally fit a geometric surface and calculate the normal from the fitted surface. Recently, learning-based methods have adopted a routine of predicting point-wise weights to solve the weighted least-squares surface fitting problem. Despite achieving remarkable progress, these methods overlook the approximation error of the fitting problem, resulting in a less accurate fitted surface. In this paper, we first carry out an in-depth analysis of the approximation error in the surface fitting problem. Then, in order to bridge the gap between estimated and precise surface normals, we present two basic design principles: 1) apply the $Z$-direction Transform to rotate local patches for a better surface fitting with a lower approximation error; 2) model the error of the normal estimation as a learnable term. We implement these two principles using deep neural networks, and integrate them with the state-of-the-art (SOTA) normal estimation methods in a plug-and-play manner. Extensive experiments verify that our approaches bring benefits to point cloud normal estimation and push the frontier of state-of-the-art performance on both synthetic and real-world datasets.

* The first two authors contributed equally to this work. The source code is available at https://github.com/hikvision-research/3DVision. Accepted to CVPR 2023
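
The weighted least-squares surface fitting that the abstract refers to can be sketched in its simplest form. The example below assumes an order-1 (planar) surface z = a·x + b·y + c in a local frame and takes the point-wise weights as given inputs rather than network predictions. The function names `det3` and `fit_plane_normal` are hypothetical, and the code is a generic illustration, not the authors' implementation.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane_normal(points, weights):
    """Fit z = a*x + b*y + c by weighted least squares; return the unit normal."""
    # Accumulate the normal equations (A^T W A) beta = A^T W z, rows of A are [x, y, 1].
    lhs = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for (x, y, z), w in zip(points, weights):
        row = (x, y, 1.0)
        for i in range(3):
            rhs[i] += w * row[i] * z
            for j in range(3):
                lhs[i][j] += w * row[i] * row[j]
    d = det3(lhs)
    if abs(d) < 1e-12:
        raise ValueError("degenerate patch: points do not span a plane in (x, y)")
    # Solve the 3x3 system with Cramer's rule.
    beta = []
    for k in range(3):
        replaced = [r[:] for r in lhs]
        for i in range(3):
            replaced[i][k] = rhs[i]
        beta.append(det3(replaced) / d)
    a, b, _ = beta
    # The surface z - a*x - b*y - c = 0 has gradient (-a, -b, 1).
    length = math.sqrt(a * a + b * b + 1.0)
    return (-a / length, -b / length, 1.0 / length)

patch = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (0.0, 1.0, 4.0), (1.0, 1.0, 6.0)]
normal = fit_plane_normal(patch, [1.0, 1.0, 1.0, 1.0])
```

A height-field fit like this degrades as the true normal tilts away from the z axis, which is one intuition for why rotating the local patch before fitting could lower the approximation error.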

1st Place Solution for ECCV 2022 OOD-CV Challenge Object Detection Track

Jan 12, 2023

Wei Zhao, Binbin Chen, Weijie Chen, Shicai Yang, Di Xie, Shiliang Pu, Yueting Zhuang

Abstract: The OOD-CV challenge is an out-of-distribution generalization task. To solve this problem in the object detection track, we propose a simple yet effective Generalize-then-Adapt (G&A) framework, which is composed of a two-stage domain generalization part and a one-stage domain adaptation part. The domain generalization part is implemented by a Supervised Model Pretraining stage using source data for model warm-up and a Weakly Semi-Supervised Model Pretraining stage using both source data with box-level labels and auxiliary data (ImageNet-1K) with image-level labels for performance boosting. The domain adaptation part is implemented as a Source-Free Domain Adaptation paradigm, which only uses the pre-trained model and the unlabeled target data to further optimize in a self-supervised training manner. The proposed G&A framework helps us achieve first place on the object detection leaderboard of the OOD-CV challenge. Code will be released at https://github.com/hikvision-research/OOD-CV.

* Tech Report

1st Place Solution for ECCV 2022 OOD-CV Challenge Image Classification Track

Jan 12, 2023

Yilu Guo, Xingyue Shi, Weijie Chen, Shicai Yang, Di Xie, Shiliang Pu, Yueting Zhuang

Abstract: The OOD-CV challenge is an out-of-distribution generalization task. In this challenge, our core solution can be summarized as follows: Noisy Label Learning Is A Strong Test-Time Domain Adaptation Optimizer. Briefly speaking, our main pipeline can be divided into two stages, a pre-training stage for domain generalization and a test-time training stage for domain adaptation. We only exploit labeled source data in the pre-training stage and only exploit unlabeled target data in the test-time training stage. In the pre-training stage, we propose a simple yet effective Mask-Level Copy-Paste data augmentation strategy to enhance out-of-distribution generalization ability so as to resist shape, pose, context, texture, occlusion, and weather domain shifts in this challenge. In the test-time training stage, we use the pre-trained model to assign noisy labels to the unlabeled target data, and propose a Label-Periodically-Updated DivideMix method for noisy label learning. After integrating Test-Time Augmentation and Model Ensemble strategies, our solution ranks first on the Image Classification Leaderboard of the OOD-CV Challenge. Code will be released at https://github.com/hikvision-research/OOD-CV.

* Tech Report

NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation

Dec 30, 2022

Pengwei Yin, Jiawu Dai, Jingjing Wang, Di Xie, Shiliang Pu

Abstract: Gaze estimation is the fundamental basis for many visual tasks. Yet, the high cost of acquiring gaze datasets with 3D annotations hinders the optimization and application of gaze estimation models. In this work, we propose a novel Head-Eye redirection parametric model based on Neural Radiance Field, which allows dense gaze data generation with view consistency and accurate gaze direction. Moreover, our head-eye redirection parametric model can decouple the face and eyes for separate neural rendering, so it can separately control the attributes of the face, identity, illumination, and eye gaze direction. Thus, diverse 3D-aware gaze datasets can be obtained by manipulating the latent codes belonging to different face attributes in an unsupervised manner. Extensive experiments on several benchmarks demonstrate the effectiveness of our method in domain generalization and domain adaptation for gaze estimation tasks.

* 10 pages, 8 figures, submitted to CVPR 2023

Attention Diversification for Domain Generalization

Oct 09, 2022

Rang Meng, Xianfeng Li, Weijie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Mingli Song, Di Xie, Shiliang Pu

Abstract: Convolutional neural networks (CNNs) have demonstrated gratifying results at learning discriminative features. However, when applied to unseen domains, state-of-the-art models are usually prone to errors due to domain shift. After investigating this issue from the perspective of shortcut learning, we find that the devil lies in the fact that models trained on different domains are merely biased toward different domain-specific features yet overlook diverse task-related features. Under this guidance, a novel Attention Diversification framework is proposed, in which Intra-Model and Inter-Model Attention Diversification Regularization collaborate to reassign appropriate attention to diverse task-related features. Briefly, Intra-Model Attention Diversification Regularization is applied to the high-level feature maps to achieve in-channel discrimination and cross-channel diversification by forcing different channels to pay their most salient attention to different spatial locations. Besides, Inter-Model Attention Diversification Regularization is proposed to further provide task-related attention diversification and domain-related attention suppression, which is a paradigm of "simulate, divide and assemble": simulate domain shift via exploiting multiple domain-specific models, divide attention maps into task-related and domain-related groups, and assemble them within each group respectively to execute regularization. Extensive experiments and analyses are conducted on various benchmarks to demonstrate that our method achieves state-of-the-art performance over other competing methods. Code is available at https://github.com/hikvision-research/DomainGeneralization.

* European Conference on Computer Vision (ECCV 2022). Code available at https://github.com/hikvision-research/DomainGeneralization

FBNet: Feedback Network for Point Cloud Completion

Oct 08, 2022

Xuejun Yan, Hongyu Yan, Jingjing Wang, Hang Du, Zhihong Wu, Di Xie, Shiliang Pu, Li Lu

Abstract: The rapid development of point cloud learning has driven point cloud completion into a new era. However, the information flows of most existing completion methods are solely feedforward, and high-level information is rarely reused to improve low-level feature learning. To this end, we propose a novel Feedback Network (FBNet) for point cloud completion, in which present features are efficiently refined by rerouting subsequent fine-grained ones. Firstly, partial inputs are fed to a Hierarchical Graph-based Network (HGNet) to generate coarse shapes. Then, we cascade several Feedback-Aware Completion (FBAC) Blocks and unfold them across time recurrently. Feedback connections between two adjacent time steps exploit fine-grained features to improve present shape generations. The main challenge of building feedback connections is the dimension mismatch between present and subsequent features. To address this, the elaborately designed point Cross Transformer exploits efficient information from feedback features via a cross-attention strategy and then refines present features with the enhanced feedback features. Quantitative and qualitative experiments on several datasets demonstrate the superiority of the proposed FBNet compared to state-of-the-art methods on the point completion task.

* The first two authors contributed equally to this work. The source code and model are available at https://github.com/hikvision-research/3DVision/. Accepted to ECCV 2022 as oral presentation

Point Cloud Upsampling via Cascaded Refinement Network

Oct 08, 2022

Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, Shiliang Pu

Abstract: Point cloud upsampling focuses on generating a dense, uniform and proximity-to-surface point set. Most previous approaches accomplish these objectives by carefully designing a single-stage network, which makes it still challenging to generate a high-fidelity point distribution. Instead, upsampling a point cloud in a coarse-to-fine manner is a decent solution. However, existing coarse-to-fine upsampling methods require extra training strategies, which are complicated and time-consuming during training. In this paper, we propose a simple yet effective cascaded refinement network, consisting of three generation stages that have the same network architecture but achieve different objectives. Specifically, the first two upsampling stages generate the dense but coarse points progressively, while the last refinement stage further adjusts the coarse points to a better position. To mitigate the learning conflicts between multiple stages and decrease the difficulty of regressing new points, we encourage each stage to predict the point offsets with respect to the input shape. In this manner, the proposed cascaded refinement network can be easily optimized without extra learning strategies. Moreover, we design a transformer-based feature extraction module to learn the informative global and local shape context. In the inference phase, we can dynamically adjust the model efficiency and effectiveness, depending on the available computational resources. Extensive experiments on both synthetic and real-scanned datasets demonstrate that the proposed approach outperforms the existing state-of-the-art methods.

* The first two authors contributed equally to this work. The code is publicly available at https://github.com/hikvision-research/3DVision. Accepted to ACCV 2022 as oral presentation

Multi-Scale Wavelet Transformer for Face Forgery Detection

Oct 08, 2022

Jie Liu, Jingjing Wang, Peng Zhang, Chunmao Wang, Di Xie, Shiliang Pu

Abstract: Currently, many face forgery detection methods aggregate spatial and frequency features to enhance the generalization ability and gain promising performance under the cross-dataset scenario. However, these methods only leverage single-level frequency information, which limits their expressive ability. To overcome these limitations, we propose a multi-scale wavelet transformer framework for face forgery detection. Specifically, to take full advantage of the multi-scale and multi-frequency wavelet representation, we gradually aggregate the multi-scale wavelet representation at different stages of the backbone network. To better fuse the frequency features with the spatial features, frequency-based spatial attention is designed to guide the spatial feature extractor to concentrate more on forgery traces. Meanwhile, cross-modality attention is proposed to fuse the frequency features with the spatial features. These two attention modules are calculated through a unified transformer block for efficiency. A wide variety of experiments demonstrate that the proposed method is efficient and effective in both within-dataset and cross-dataset settings.

* The first two authors contributed equally to this work. Accepted to ACCV 2022 as oral presentation

Semi-supervised Ranking for Object Image Blur Assessment

Jul 13, 2022

Qiang Li, Zhaoliang Yao, Jingjing Wang, Ye Tian, Pengju Yang, Di Xie, Shiliang Pu

Abstract: Assessing the blurriness of an object image is fundamentally important to improve the performance of object recognition and retrieval. The main challenge lies in the lack of abundant images with reliable labels and effective learning strategies. Current datasets are labeled with limited and confused quality levels. To overcome this limitation, we propose to label the rank relationships between pairwise images rather than their quality levels, since it is much easier for humans to label, and establish a large-scale realistic face image blur assessment dataset with reliable labels. Based on this dataset, we propose a method to obtain the blur scores only with the pairwise rank labels as supervision. Moreover, to further improve the performance, we propose a self-supervised method based on quadruplet ranking consistency to leverage the unlabeled data more effectively. The supervised and self-supervised methods constitute a final semi-supervised learning framework, which can be trained end-to-end. Experimental results demonstrate the effectiveness of our method.

* The first two authors contributed equally to this work. Dataset is available at https://github.com/yzliangHIK2022/SSRanking-for-Object-BA. Accepted to ICIP 2022
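
Learning blur scores from pairwise rank labels alone, as described in the abstract, can be sketched with a generic pairwise loss. The abstract does not state which loss the authors use, so the logistic (RankNet-style) form below, the function name `pairwise_rank_loss`, and the convention that a higher score means a blurrier image are all assumptions for illustration.

```python
import math

def pairwise_rank_loss(score_a, score_b, a_is_blurrier):
    """Logistic loss on the score difference of an image pair.

    The loss is small when the predicted ordering agrees with the rank label
    and grows as the predicted scores contradict it.
    """
    margin = score_a - score_b
    if not a_is_blurrier:
        margin = -margin
    # Numerically stable evaluation of log(1 + exp(-margin)).
    return max(0.0, -margin) + math.log1p(math.exp(-abs(margin)))

# Two predicted blur scores and a human label saying image A is blurrier.
agreeing = pairwise_rank_loss(2.0, 0.0, a_is_blurrier=True)
contradicting = pairwise_rank_loss(0.0, 2.0, a_is_blurrier=True)
```

Only the ordering of each pair enters the loss, so no absolute quality level has to be annotated.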
