Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xin Fan

Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

Feb 25, 2024

Zhiying Jiang, Xingyuan Li, Jinyuan Liu, Xin Fan, Risheng Liu

Figure 1 for Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

Figure 2 for Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

Figure 3 for Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

Figure 4 for Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

Abstract:Image stitching seamlessly integrates images captured from varying perspectives into a single wide field-of-view image. Such integration not only broadens the captured scene but also augments holistic perception in computer vision applications. Given a pair of captured images, subtle perturbations and distortions which go unnoticed by the human visual system tend to attack the correspondence matching, impairing the performance of image stitching algorithms. In light of this challenge, this paper presents the first attempt to improve the robustness of image stitching against adversarial attacks. Specifically, we introduce a stitching-oriented attack~(SoA), tailored to amplify the alignment loss within overlapping regions, thereby targeting the feature matching procedure. To establish an attack resistant model, we delve into the robustness of stitching architecture and develop an adaptive adversarial training~(AAT) to balance attack resistance with stitching precision. In this way, we relieve the gap between the routine adversarial training and benign models, ensuring resilience without quality compromise. Comprehensive evaluation across real-world and synthetic datasets validate the deterioration of SoA on stitching performance. Furthermore, AAT emerges as a more robust solution against adversarial perturbations, delivering superior stitching results. Code is available at:https://github.com/Jzy2017/TRIS.

* Accepted by AAAI2024

Via

Access Paper or Ask Questions

From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

Dec 31, 2023

Xingyuan Li, Yang Zou, Jinyuan Liu, Zhiying Jiang, Long Ma, Xin Fan, Risheng Liu

Figure 1 for From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

Figure 2 for From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

Figure 3 for From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

Figure 4 for From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

Abstract:With the rapid progression of deep learning technologies, multi-modality image fusion has become increasingly prevalent in object detection tasks. Despite its popularity, the inherent disparities in how different sources depict scene content make fusion a challenging problem. Current fusion methodologies identify shared characteristics between the two modalities and integrate them within this shared domain using either iterative optimization or deep learning architectures, which often neglect the intricate semantic relationships between modalities, resulting in a superficial understanding of inter-modal connections and, consequently, suboptimal fusion outcomes. To address this, we introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images. This method capitalizes on the complementary characteristics of diverse modalities, bolstering both the accuracy and robustness of object detection. The codebook is utilized to enhance a streamlined and concise depiction of the fused intra- and inter-domain dynamics, fine-tuned for optimal performance in detection tasks. We present a bilevel optimization strategy that establishes a nexus between the joint problem of fusion and detection, optimizing both processes concurrently. Furthermore, we introduce the first dataset of paired infrared and visible images accompanied by text prompts, paving the way for future research. Extensive experiments on several datasets demonstrate that our method not only produces visually superior fusion results but also achieves a higher detection mAP over existing methods, achieving state-of-the-art results.

* 10 pages, 12 figures, 3 tables, conference

Via

Access Paper or Ask Questions

Learn from the Past: A Proxy based Adversarial Defense Framework to Boost Robustness

Oct 19, 2023

Yaohua Liu, Jiaxin Gao, Zhu Liu, Xianghao Jiao, Xin Fan, Risheng Liu

Figure 1 for Learn from the Past: A Proxy based Adversarial Defense Framework to Boost Robustness

Figure 2 for Learn from the Past: A Proxy based Adversarial Defense Framework to Boost Robustness

Figure 3 for Learn from the Past: A Proxy based Adversarial Defense Framework to Boost Robustness

Figure 4 for Learn from the Past: A Proxy based Adversarial Defense Framework to Boost Robustness

Abstract:In light of the vulnerability of deep learning models to adversarial samples and the ensuing security issues, a range of methods, including Adversarial Training (AT) as a prominent representative, aimed at enhancing model robustness against various adversarial attacks, have seen rapid development. However, existing methods essentially assist the current state of target model to defend against parameter-oriented adversarial attacks with explicit or implicit computation burdens, which also suffers from unstable convergence behavior due to inconsistency of optimization trajectories. Diverging from previous work, this paper reconsiders the update rule of target model and corresponding deficiency to defend based on its current state. By introducing the historical state of the target model as a proxy, which is endowed with much prior information for defense, we formulate a two-stage update rule, resulting in a general adversarial defense framework, which we refer to as `LAST' ({\bf L}earn from the P{\bf ast}). Besides, we devise a Self Distillation (SD) based defense objective to constrain the update process of the proxy model without the introduction of larger teacher models. Experimentally, we demonstrate consistent and significant performance enhancements by refining a series of single-step and multi-step AT methods (e.g., up to $\bf 9.2\%$ and $\bf 20.5\%$ improvement of Robust Accuracy (RA) on CIFAR10 and CIFAR100 datasets, respectively) across various datasets, backbones and attack modalities, and validate its ability to enhance training stability and ameliorate catastrophic overfitting issues meanwhile.

* 16 Pages

Via

Access Paper or Ask Questions

Diving into Darkness: A Dual-Modulated Framework for High-Fidelity Super-Resolution in Ultra-Dark Environments

Sep 11, 2023

Jiaxin Gao, Ziyu Yue, Yaohua Liu, Sihan Xie, Xin Fan, Risheng Liu

Figure 1 for Diving into Darkness: A Dual-Modulated Framework for High-Fidelity Super-Resolution in Ultra-Dark Environments

Figure 2 for Diving into Darkness: A Dual-Modulated Framework for High-Fidelity Super-Resolution in Ultra-Dark Environments

Figure 3 for Diving into Darkness: A Dual-Modulated Framework for High-Fidelity Super-Resolution in Ultra-Dark Environments

Figure 4 for Diving into Darkness: A Dual-Modulated Framework for High-Fidelity Super-Resolution in Ultra-Dark Environments

Abstract:Super-resolution tasks oriented to images captured in ultra-dark environments is a practical yet challenging problem that has received little attention. Due to uneven illumination and low signal-to-noise ratio in dark environments, a multitude of problems such as lack of detail and color distortion may be magnified in the super-resolution process compared to normal-lighting environments. Consequently, conventional low-light enhancement or super-resolution methods, whether applied individually or in a cascaded manner for such problem, often encounter limitations in recovering luminance, color fidelity, and intricate details. To conquer these issues, this paper proposes a specialized dual-modulated learning framework that, for the first time, attempts to deeply dissect the nature of the low-light super-resolution task. Leveraging natural image color characteristics, we introduce a self-regularized luminance constraint as a prior for addressing uneven lighting. Expanding on this, we develop Illuminance-Semantic Dual Modulation (ISDM) components to enhance feature-level preservation of illumination and color details. Besides, instead of deploying naive up-sampling strategies, we design the Resolution-Sensitive Merging Up-sampler (RSMU) module that brings together different sampling modalities as substrates, effectively mitigating the presence of artifacts and halos. Comprehensive experiments showcases the applicability and generalizability of our approach to diverse and challenging ultra-low-light conditions, outperforming state-of-the-art methods with a notable improvement (i.e., $\uparrow$5\% in PSNR, and $\uparrow$43\% in LPIPS). Especially noteworthy is the 19-fold increase in the RMSE score, underscoring our method's exceptional generalization across different darkness levels. The code will be available online upon publication of the paper.

* 9 pages

Via

Access Paper or Ask Questions

Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation

Sep 07, 2023

Xiaohan Cui, Long Ma, Tengyu Ma, Jinyuan Liu, Xin Fan, Risheng Liu

Abstract:Object detection in low-light scenarios has attracted much attention in the past few years. A mainstream and representative scheme introduces enhancers as the pre-processing for regular detectors. However, because of the disparity in task objectives between the enhancer and detector, this paradigm cannot shine at its best ability. In this work, we try to arouse the potential of enhancer + detector. Different from existing works, we extend the illumination-based enhancers (our newly designed or existing) as a scene decomposition module, whose removed illumination is exploited as the auxiliary in the detector for extracting detection-friendly features. A semantic aggregation module is further established for integrating multi-scale scene-related semantic information in the context space. Actually, our built scheme successfully transforms the "trash" (i.e., the ignored illumination in the detector) into the "treasure" for the detector. Plenty of experiments are conducted to reveal our superiority against other state-of-the-art methods. The code will be public if it is accepted.

Via

Access Paper or Ask Questions

Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Sep 03, 2023

Guanyao Wu, Hongming Fu, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

Figure 1 for Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Figure 2 for Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Figure 3 for Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Figure 4 for Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Abstract:Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels. Despite its advancements, the field grapples with challenges, notably the reliance on manual designs for network structures and loss functions, and the constraints of utilizing simulated reference images as ground truths. Consequently, current methodologies often suffer from color distortions and exposure artifacts, further complicating the quest for authentic image representation. In addressing these challenges, this paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for automatic design of both network structures and loss functions. More specifically, we harnesses a unique dual research mechanism rooted in a novel weighted structure refinement architecture search. Besides, a hybrid supervised contrast constraint seamlessly guides and integrates with searching process, facilitating a more adaptive and comprehensive search for optimal loss functions. We realize the state-of-the-art performance in comparison to various competitive schemes, yielding a 10.61% and 4.38% improvement in Visual Information Fidelity (VIF) for general and no-reference scenarios, respectively, while providing results with high contrast, rich details and colors.

Via

Access Paper or Ask Questions

Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework

Sep 03, 2023

Zhu Liu, Zihang Chen, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

Figure 1 for Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework

Figure 2 for Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework

Figure 3 for Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework

Figure 4 for Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework

Abstract:The detection of small infrared targets against blurred and cluttered backgrounds has remained an enduring challenge. In recent years, learning-based schemes have become the mainstream methodology to establish the mapping directly. However, these methods are susceptible to the inherent complexities of changing backgrounds and real-world disturbances, leading to unreliable and compromised target estimations. In this work, we propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions. We first propose a bi-level optimization formulation to introduce dynamic adversarial learning. Specifically, it is composited by the learnable generation of corruptions to maximize the losses as the lower-level objective and the robustness promotion of detectors as the upper-level one. We also provide a hierarchical reinforced learning strategy to discover the most detrimental corruptions and balance the performance between robustness and accuracy. To better disentangle the corruptions from salient features, we also propose a spatial-frequency interaction network for target detection. Extensive experiments demonstrate our scheme remarkably improves 21.96% IOU across a wide array of corruptions and notably promotes 4.97% IOU on the general benchmark. The source codes are available at https://github.com/LiuZhu-CV/BALISTD.

* 9 pages, 6 figures

Via

Access Paper or Ask Questions

AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

Sep 03, 2023

Xingyuan Li, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

Figure 1 for AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

Figure 2 for AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

Figure 3 for AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

Figure 4 for AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

Abstract:Monocular 3D object detection plays a pivotal role in the field of autonomous driving and numerous deep learning-based methods have made significant breakthroughs in this area. Despite the advancements in detection accuracy and efficiency, these models tend to fail when faced with such attacks, rendering them ineffective. Therefore, bolstering the adversarial robustness of 3D detection models has become a crucial issue that demands immediate attention and innovative solutions. To mitigate this issue, we propose a depth-aware robust adversarial training method for monocular 3D object detection, dubbed DART3D. Specifically, we first design an adversarial attack that iteratively degrades the 2D and 3D perception capabilities of 3D object detection models(IDP), serves as the foundation for our subsequent defense mechanism. In response to this attack, we propose an uncertainty-based residual learning method for adversarial training. Our adversarial training approach capitalizes on the inherent uncertainty, enabling the model to significantly improve its robustness against adversarial attacks. We conducted extensive experiments on the KITTI 3D datasets, demonstrating that DART3D surpasses direct adversarial training (the most popular approach) under attacks in 3D object detection $AP_{R40}$ of car category for the Easy, Moderate, and Hard settings, with improvements of 4.415%, 4.112%, and 3.195%, respectively.

Via

Access Paper or Ask Questions

Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

Sep 02, 2023

Gehui Li, Jinyuan Liu, Long Ma, Zhiying Jiang, Xin Fan, Risheng Liu

Figure 1 for Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

Figure 2 for Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

Figure 3 for Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

Figure 4 for Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

Abstract:Photographs taken with less-than-ideal exposure settings often display poor visual quality. Since the correction procedures vary significantly, it is difficult for a single neural network to handle all exposure problems. Moreover, the inherent limitations of convolutions, hinder the models ability to restore faithful color or details on extremely over-/under- exposed regions. To overcome these limitations, we propose a Macro-Micro-Hierarchical transformer, which consists of a macro attention to capture long-range dependencies, a micro attention to extract local features, and a hierarchical structure for coarse-to-fine correction. In specific, the complementary macro-micro attention designs enhance locality while allowing global interactions. The hierarchical structure enables the network to correct exposure errors of different scales layer by layer. Furthermore, we propose a contrast constraint and couple it seamlessly in the loss function, where the corrected image is pulled towards the positive sample and pushed away from the dynamically generated negative samples. Thus the remaining color distortion and loss of detail can be removed. We also extend our method as an image enhancer for low-light face recognition and low-light semantic segmentation. Experiments demonstrate that our approach obtains more attractive results than state-of-the-art methods quantitatively and qualitatively.

* Accepted by ACM MM 2023

Via

Access Paper or Ask Questions

Learning Heavily-Degraded Prior for Underwater Object Detection

Aug 24, 2023

Chenping Fu, Xin Fan, Jiewen Xiao, Wanqi Yuan, Risheng Liu, Zhongxuan Luo

Figure 1 for Learning Heavily-Degraded Prior for Underwater Object Detection

Figure 2 for Learning Heavily-Degraded Prior for Underwater Object Detection

Figure 3 for Learning Heavily-Degraded Prior for Underwater Object Detection

Figure 4 for Learning Heavily-Degraded Prior for Underwater Object Detection

Abstract:Underwater object detection suffers from low detection performance because the distance and wavelength dependent imaging process yield evident image quality degradations such as haze-like effects, low visibility, and color distortions. Therefore, we commit to resolving the issue of underwater object detection with compounded environmental degradations. Typical approaches attempt to develop sophisticated deep architecture to generate high-quality images or features. However, these methods are only work for limited ranges because imaging factors are either unstable, too sensitive, or compounded. Unlike these approaches catering for high-quality images or features, this paper seeks transferable prior knowledge from detector-friendly images. The prior guides detectors removing degradations that interfere with detection. It is based on statistical observations that, the heavily degraded regions of detector-friendly (DFUI) and underwater images have evident feature distribution gaps while the lightly degraded regions of them overlap each other. Therefore, we propose a residual feature transference module (RFTM) to learn a mapping between deep representations of the heavily degraded patches of DFUI- and underwater- images, and make the mapping as a heavily degraded prior (HDP) for underwater detection. Since the statistical properties are independent to image content, HDP can be learned without the supervision of semantic labels and plugged into popular CNNbased feature extraction networks to improve their performance on underwater object detection. Without bells and whistles, evaluations on URPC2020 and UODD show that our methods outperform CNN-based detectors by a large margin. Our method with higher speeds and less parameters still performs better than transformer-based detectors. Our code and DFUI dataset can be found in https://github.com/xiaoDetection/Learning-Heavily-Degraed-Prior.

Via

Access Paper or Ask Questions