Jinyuan Liu

Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation

Sep 07, 2023
Xiaohan Cui, Long Ma, Tengyu Ma, Jinyuan Liu, Xin Fan, Risheng Liu

Object detection in low-light scenarios has attracted much attention in recent years. A mainstream and representative scheme introduces enhancers as pre-processing for regular detectors. However, because the enhancer and detector pursue different task objectives, this paradigm cannot reach its full potential. In this work, we aim to unlock the potential of the enhancer + detector combination. Different from existing works, we extend illumination-based enhancers (either newly designed or existing ones) into a scene decomposition module, whose removed illumination is exploited as an auxiliary signal in the detector for extracting detection-friendly features. A semantic aggregation module is further established to integrate multi-scale scene-related semantic information in the context space. In effect, our scheme transforms the "trash" (i.e., the illumination normally ignored by the detector) into "treasure" for the detector. Extensive experiments demonstrate our superiority over other state-of-the-art methods. The code will be released if the paper is accepted.
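
A minimal sketch of the central idea, not the authors' code: a Retinex-style module splits the low-light input into reflectance and illumination, and the illumination map that an enhancer would normally throw away is fed back into the detector backbone as an auxiliary cue. All module and parameter names below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SceneDecomposition(nn.Module):
    """Illustrative Retinex-style decomposition: predict an illumination map L
    so that the enhanced (reflectance-like) image is I / L."""
    def __init__(self, channels=16):
        super().__init__()
        self.illum_net = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, low_light):
        illum = self.illum_net(low_light).clamp(min=1e-3)  # B x 1 x H x W
        reflectance = low_light / illum                     # enhanced image
        return reflectance, illum

class IlluminationGuidedStem(nn.Module):
    """Hypothetical detector stem that consumes the 'removed' illumination
    as an extra input channel instead of discarding it."""
    def __init__(self, out_channels=64):
        super().__init__()
        self.stem = nn.Conv2d(3 + 1, out_channels, 3, stride=2, padding=1)

    def forward(self, reflectance, illum):
        return self.stem(torch.cat([reflectance, illum], dim=1))

# usage sketch
x = torch.rand(1, 3, 256, 256)                           # low-light image in [0, 1]
reflectance, illum = SceneDecomposition()(x)
features = IlluminationGuidedStem()(reflectance, illum)  # detection-friendly features
```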

Holistic Dynamic Frequency Transformer for Image Fusion and Exposure Correction

Sep 03, 2023
Xiaoke Shang, Gehui Li, Zhiying Jiang, Shaomin Zhang, Nai Ding, Jinyuan Liu

The correction of exposure-related issues is a pivotal component in enhancing image quality, with substantial implications for various computer vision tasks. Historically, most methodologies have relied predominantly on spatial-domain recovery, giving limited consideration to the potential of the frequency domain. Additionally, there has been no unified perspective on low-light enhancement, exposure correction, and multi-exposure fusion, which complicates and impedes the optimization of image processing. In response to these challenges, this paper proposes a novel methodology that leverages the frequency domain to improve and unify the handling of exposure correction tasks. Our method introduces a Holistic Frequency Attention and a Dynamic Frequency Feed-Forward Network, which replace conventional correlation computation in the spatial domain. They form a foundational building block of a U-shaped Holistic Dynamic Frequency Transformer that acts as a filter to extract global information and dynamically select important frequency bands for image restoration. Complementing this, we employ a Laplacian pyramid to decompose images into distinct frequency bands, followed by multiple restorers, each tuned to recover specific frequency-band information. The pyramid fusion allows a more detailed and nuanced image restoration process. Ultimately, our structure unifies the three tasks of low-light enhancement, exposure correction, and multi-exposure fusion, enabling comprehensive treatment of all classical exposure errors. Benchmarking on mainstream datasets for these tasks, our proposed method achieves state-of-the-art results, paving the way for more sophisticated and unified solutions in exposure correction.
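
A rough illustration of the pyramid part of this design, not the paper's implementation: decompose an image into Laplacian frequency bands, run a separate restorer on each band, and recompose. The per-band restorers below are placeholder convolutions standing in for the frequency-domain transformer blocks described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def build_laplacian_pyramid(img, levels=3):
    """Split an image into band-pass details plus a low-frequency residual."""
    bands, current = [], img
    for _ in range(levels):
        down = F.avg_pool2d(current, 2)
        up = F.interpolate(down, size=current.shape[-2:], mode="bilinear",
                           align_corners=False)
        bands.append(current - up)   # detail band at this scale
        current = down
    bands.append(current)            # low-frequency residual
    return bands

def reconstruct_from_pyramid(bands):
    img = bands[-1]
    for band in reversed(bands[:-1]):
        img = F.interpolate(img, size=band.shape[-2:], mode="bilinear",
                            align_corners=False) + band
    return img

# placeholder restorers, one per band (3 detail bands + 1 residual)
restorers = nn.ModuleList([nn.Conv2d(3, 3, 3, padding=1) for _ in range(4)])

x = torch.rand(1, 3, 128, 128)
bands = build_laplacian_pyramid(x, levels=3)
restored = [r(b) for r, b in zip(restorers, bands)]
y = reconstruct_from_pyramid(restored)   # fused, exposure-corrected estimate
```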

Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Sep 03, 2023
Guanyao Wu, Hongming Fu, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels. Despite its advancements, the field grapples with challenges, notably the reliance on manual designs for network structures and loss functions, and the constraints of using simulated reference images as ground truths. Consequently, current methodologies often suffer from color distortions and exposure artifacts, further complicating the quest for authentic image representation. To address these challenges, this paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for the automatic design of both network structures and loss functions. More specifically, we harness a dual-search mechanism rooted in a novel weighted structure refinement architecture search. Besides, a hybrid-supervised contrast constraint seamlessly guides and integrates with the searching process, facilitating a more adaptive and comprehensive search for optimal loss functions. We achieve state-of-the-art performance compared with various competitive schemes, yielding 10.61% and 4.38% improvements in Visual Information Fidelity (VIF) for general and no-reference scenarios, respectively, while providing results with high contrast, rich details, and colors.
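
To make the "weighted structure refinement" search concrete, here is a generic differentiable architecture-search cell in the DARTS style: candidate operations are mixed with softmax-weighted architecture parameters that are learned jointly with the network. This is a sketch of the general technique under that assumption, not the HSDS-MEF search itself, and every name is illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted mixture of candidate operations (continuous relaxation)."""
    def __init__(self, channels):
        super().__init__()
        self.candidates = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        # architecture weights, typically optimized at the upper level of a bi-level search
        self.alpha = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.candidates))

cell = MixedOp(channels=16)
y = cell(torch.rand(1, 16, 64, 64))   # after the search, the argmax candidate is kept
```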

AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

Sep 03, 2023
Xingyuan Li, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

Monocular 3D object detection plays a pivotal role in autonomous driving, and numerous deep learning-based methods have made significant breakthroughs in this area. Despite the advancements in detection accuracy and efficiency, these models tend to fail when faced with adversarial attacks, rendering them ineffective. Therefore, bolstering the adversarial robustness of 3D detection models has become a crucial issue that demands immediate attention and innovative solutions. To mitigate this issue, we propose a depth-aware robust adversarial training method for monocular 3D object detection, dubbed DART3D. Specifically, we first design an adversarial attack (IDP) that iteratively degrades the 2D and 3D perception capabilities of 3D object detection models, which serves as the foundation for our subsequent defense mechanism. In response to this attack, we propose an uncertainty-based residual learning method for adversarial training. Our adversarial training approach capitalizes on the inherent uncertainty, enabling the model to significantly improve its robustness against adversarial attacks. We conducted extensive experiments on the KITTI 3D dataset, demonstrating that DART3D surpasses direct adversarial training (the most popular approach) under attack, improving the car-category 3D object detection $AP_{R40}$ by 4.415%, 4.112%, and 3.195% for the Easy, Moderate, and Hard settings, respectively.
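
The sketch below shows the generic adversarial-training pattern the abstract describes: an inner loop crafts a perturbation that maximizes the detection loss, and the outer step updates the detector on the perturbed input. The detector, its loss, and the uncertainty-based residual learning are abstracted away, so every name here is an assumption rather than the paper's implementation.

```python
import torch

def pgd_attack(model, loss_fn, images, targets, eps=4/255, alpha=1/255, steps=5):
    """Iteratively perturb the input to maximize the detection loss (PGD-style)."""
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = loss_fn(model(adv), targets)
        grad, = torch.autograd.grad(loss, adv)
        adv = adv.detach() + alpha * grad.sign()
        adv = images + (adv - images).clamp(-eps, eps)  # project into the eps-ball
        adv = adv.clamp(0, 1)
    return adv.detach()

def adversarial_training_step(model, loss_fn, optimizer, images, targets):
    """Outer step: defend by training on the crafted adversarial examples."""
    adv = pgd_attack(model, loss_fn, images, targets)
    optimizer.zero_grad()
    loss = loss_fn(model(adv), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```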

Dual Adversarial Resilience for Collaborating Robust Underwater Image Enhancement and Perception

Sep 03, 2023
Zengxi Zhang, Zhiying Jiang, Zeru Shi, Jinyuan Liu, Risheng Liu

Due to the uneven scattering and absorption of different light wavelengths in aquatic environments, underwater images suffer from low visibility and noticeable color deviations. With the advancement of autonomous underwater vehicles, extensive research has been conducted on learning-based underwater enhancement algorithms. These works can generate visually pleasing enhanced images and mitigate the adverse effects of degraded images on subsequent perception tasks. However, learning-based methods are inherently vulnerable to adversarial attacks, which can significantly disrupt their results. In this work, we introduce a collaborative adversarial resilience network, dubbed CARNet, for underwater image enhancement and subsequent detection tasks. Concretely, we first introduce an invertible network with strong perturbation-perception abilities to isolate attacks from underwater images, preventing interference with image enhancement and perceptual tasks. Furthermore, we propose a synchronized attack training strategy with both vision-driven and perception-driven attacks, enabling the network to discern and remove various types of attacks. Additionally, we incorporate an attack pattern discriminator to heighten the robustness of the network against different attacks. Extensive experiments demonstrate that the proposed method outputs visually appealing enhanced images and achieves, on average, 6.71% higher detection mAP than state-of-the-art methods.

* 9 pages, 9 figures 
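
The invertible-network component above can be illustrated with a textbook additive coupling layer, which is exactly invertible and is the kind of building block that lets a network route a perturbation component into one branch while keeping the clean content recoverable. This is a generic sketch of invertible coupling, not CARNet's architecture; all names are assumptions.

```python
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    """Exactly invertible layer: split channels and shift one half by a function of the other."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.shift = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        return torch.cat([x1, x2 + self.shift(x1)], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        return torch.cat([y1, y2 - self.shift(y1)], dim=1)

layer = AdditiveCoupling(channels=8)
x = torch.rand(1, 8, 32, 32)
assert torch.allclose(layer.inverse(layer(x)), x, atol=1e-6)  # lossless round trip
```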

Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework

Sep 03, 2023
Zhu Liu, Zihang Chen, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

The detection of small infrared targets against blurred and cluttered backgrounds has remained an enduring challenge. In recent years, learning-based schemes have become the mainstream methodology for establishing this mapping directly. However, these methods are susceptible to the inherent complexities of changing backgrounds and real-world disturbances, leading to unreliable and compromised target estimations. In this work, we propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions. We first propose a bi-level optimization formulation to introduce dynamic adversarial learning: the learnable generation of corruptions that maximizes the losses serves as the lower-level objective, and the robustness promotion of the detector serves as the upper-level one. We also provide a hierarchical reinforcement learning strategy to discover the most detrimental corruptions and balance the trade-off between robustness and accuracy. To better disentangle the corruptions from salient features, we further propose a spatial-frequency interaction network for target detection. Extensive experiments demonstrate that our scheme remarkably improves IoU by 21.96% across a wide array of corruptions and by 4.97% on the general benchmark. The source codes are available at https://github.com/LiuZhu-CV/BALISTD.

* 9 pages, 6 figures 
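
In schematic form, the bi-level objective described above can be written as a nested problem with the corruption generator at the lower level and the detector at the upper level; the notation is generic rather than quoted from the paper:

```latex
\min_{\theta}\; \mathbb{E}_{(x,y)}\Big[
  \mathcal{L}_{\mathrm{det}}\big(f_{\theta}\big(x + g_{\omega^{*}(\theta)}(x)\big),\, y\big)\Big]
\quad \text{s.t.} \quad
\omega^{*}(\theta) \in \arg\max_{\omega}\;
  \mathcal{L}_{\mathrm{det}}\big(f_{\theta}\big(x + g_{\omega}(x)\big),\, y\big),
```

where $f_{\theta}$ is the detector, $g_{\omega}$ generates the learnable corruption, and $\mathcal{L}_{\mathrm{det}}$ is the detection loss.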

Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

Sep 02, 2023
Gehui Li, Jinyuan Liu, Long Ma, Zhiying Jiang, Xin Fan, Risheng Liu

Photographs taken with less-than-ideal exposure settings often display poor visual quality. Since the correction procedures vary significantly, it is difficult for a single neural network to handle all exposure problems. Moreover, the inherent limitations of convolutions hinder the model's ability to restore faithful color or details in extremely over-/under-exposed regions. To overcome these limitations, we propose a Macro-Micro-Hierarchical Transformer, which consists of a macro attention to capture long-range dependencies, a micro attention to extract local features, and a hierarchical structure for coarse-to-fine correction. Specifically, the complementary macro-micro attention designs enhance locality while allowing global interactions. The hierarchical structure enables the network to correct exposure errors of different scales layer by layer. Furthermore, we propose a contrast constraint and couple it seamlessly into the loss function, where the corrected image is pulled towards the positive sample and pushed away from dynamically generated negative samples. Thus the remaining color distortion and loss of detail can be removed. We also extend our method as an image enhancer for low-light face recognition and low-light semantic segmentation. Experiments demonstrate that our approach obtains more attractive results than state-of-the-art methods, both quantitatively and qualitatively.

* Accepted by ACM MM 2023 
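
A hedged sketch of the kind of contrast constraint mentioned above: the corrected output is pulled toward a positive (well-exposed) reference and pushed away from negative samples in a feature space. The feature extractor, distance, and weighting below are placeholders, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def contrast_constraint(feat_fn, corrected, positive, negatives, eps=1e-6):
    """Ratio-style contrastive term: distance to the positive over mean distance to negatives."""
    f_out, f_pos = feat_fn(corrected), feat_fn(positive)
    pos_dist = F.l1_loss(f_out, f_pos)
    neg_dist = sum(F.l1_loss(f_out, feat_fn(n)) for n in negatives) / len(negatives)
    return pos_dist / (neg_dist + eps)

# usage sketch with an arbitrary frozen feature extractor standing in for e.g. VGG features
feat_fn = lambda img: F.avg_pool2d(img, 4)
out = torch.rand(1, 3, 64, 64, requires_grad=True)       # network output
pos = torch.rand(1, 3, 64, 64)                            # well-exposed reference
negs = [torch.rand(1, 3, 64, 64) for _ in range(2)]       # dynamically generated negatives
contrast_constraint(feat_fn, out, pos, negs).backward()
```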

Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration

Aug 22, 2023
Di Wang, Jinyuan Liu, Long Ma, Risheng Liu, Xin Fan

Misalignments between multi-modality images pose challenges for image fusion, manifesting as structural distortions and edge ghosts. Existing efforts commonly resort to registering first and fusing later, typically employing two cascaded stages for registration, i.e., coarse registration and fine registration, both of which directly estimate their respective target deformation fields. In this paper, we argue that the separate two-stage registration is not compact, and that the direct estimation of the target deformation fields is not accurate enough. To address these challenges, we propose a Cross-modality Multi-scale Progressive Dense Registration (C-MPDR) scheme, which accomplishes coarse-to-fine registration within a single-stage optimization, thus improving the fusion performance of misaligned multi-modality images. Specifically, two pivotal components are involved: a dense Deformation Field Fusion (DFF) module and a Progressive Feature Fine (PFF) module. The DFF aggregates the predicted multi-scale deformation sub-fields at the current scale, while the PFF progressively refines the remaining misaligned features. Together they accurately estimate the final deformation fields. In addition, we develop a Transformer-Conv-based Fusion (TCF) subnetwork that considers local and long-range feature dependencies, allowing us to capture more informative features from the registered infrared and visible images for the generation of high-quality fused images. Extensive experimental analysis demonstrates the superiority of the proposed method in the fusion of misaligned cross-modality images.
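
To ground the registration terminology, the sketch below shows how a dense deformation field warps one modality onto another by bilinear sampling, and how sub-fields predicted at different scales could be upsampled and summed into one field. The paper's DFF/PFF modules are more elaborate; every name here is an assumption.

```python
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Warp an image by a dense deformation field of pixel offsets (B x 2 x H x W)."""
    _, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack([xs, ys], dim=0).float().unsqueeze(0).to(image)
    coords = base + flow
    grid_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0   # normalize to [-1, 1]
    grid_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack([grid_x, grid_y], dim=-1)
    return F.grid_sample(image, grid, mode="bilinear", align_corners=True)

def fuse_subfields(subfields, target_size):
    """Aggregate multi-scale deformation sub-fields into one field at the target resolution."""
    total = 0
    for f in subfields:
        scale = target_size[0] / f.shape[-2]            # rescale offsets with the resolution
        total = total + scale * F.interpolate(f, size=target_size, mode="bilinear",
                                              align_corners=True)
    return total

ir = torch.rand(1, 1, 64, 64)                            # e.g. an infrared image to register
subfields = [torch.zeros(1, 2, 16, 16), torch.zeros(1, 2, 32, 32)]
registered = warp(ir, fuse_subfields(subfields, (64, 64)))  # input to the fusion network
```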

PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant Semantic Segmentation

Aug 08, 2023
Zhu Liu, Jinyuan Liu, Benzhuang Zhang, Long Ma, Xin Fan, Risheng Liu

Infrared and visible image fusion is a powerful technique that combines complementary information from different modalities for downstream semantic perception tasks. Existing learning-based methods show remarkable performance but suffer from an inherent vulnerability to adversarial attacks, which causes a significant decrease in accuracy. In this work, a perception-aware fusion framework is proposed to promote segmentation robustness in adversarial scenes. We first conduct systematic analyses of the components of image fusion, investigating their correlation with segmentation robustness under adversarial perturbations. Based on these analyses, we propose a harmonized architecture search with a decomposition-based structure to balance standard accuracy and robustness. We also propose an adaptive learning strategy to improve the parameter robustness of image fusion, which can learn effective feature extraction under diverse adversarial perturbations. Thus, the goals of image fusion (i.e., extracting complementary features from source modalities and defending against attacks) can be realized from both the architectural and the learning-strategy perspectives. Extensive experimental results demonstrate that our scheme substantially enhances robustness, with a gain of 15.3% segmentation mIoU in adversarial scenes compared with advanced competitors. The source codes are available at https://github.com/LiuZhu-CV/PAIF.

* Accepted by ACM MM'2023; the source codes are available at https://github.com/LiuZhu-CV/PAIF 
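
One common way to write the accuracy-robustness trade-off that the harmonized search balances is a weighted sum of a clean-data loss and a worst-case adversarial loss; this generic formulation is for orientation only and is not quoted from the paper:

```latex
\min_{\theta}\; \mathbb{E}_{(x,y)}\Big[
  \mathcal{L}_{\mathrm{seg}}\big(f_{\theta}(x),\, y\big)
  \;+\; \lambda \max_{\|\delta\|_{\infty} \le \epsilon}
  \mathcal{L}_{\mathrm{seg}}\big(f_{\theta}(x + \delta),\, y\big)\Big],
```

where $f_{\theta}$ denotes the fusion-plus-segmentation pipeline, $\delta$ is a bounded adversarial perturbation, and $\lambda$ trades standard accuracy against robustness.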

Bilevel Generative Learning for Low-Light Vision

Aug 07, 2023
Yingchi Liu, Zhu Liu, Long Ma, Jinyuan Liu, Xin Fan, Zhongxuan Luo, Risheng Liu

Recently, there has been growing interest in constructing deep learning schemes for Low-Light Vision (LLV). Existing techniques primarily focus on designing task-specific and data-dependent vision models in the standard RGB domain, which inherently contain latent data associations. In this study, we propose a generic low-light vision solution by introducing a generative block to convert data from the RAW to the RGB domain. This novel approach connects diverse vision problems by explicitly modeling the data generation process, which is the first attempt of its kind in the field. To precisely characterize the latent correspondence between the generative procedure and the vision task, we establish a bilevel model with the parameters of the generative block defined as the upper level and the parameters of the vision task defined as the lower level. We further develop two types of learning strategies targeting different goals, namely low cost and high accuracy, to acquire a new bilevel generative learning paradigm. Through bilevel optimization on enhancement tasks, the generative block acquires a strong generalization ability across other low-light vision tasks. Extensive experimental evaluations on three representative low-light vision tasks, namely enhancement, detection, and segmentation, fully demonstrate the superiority of our proposed approach. The code will be available at https://github.com/Yingchi1998/BGL.

* Accepted by ACM MM'2023, The code will be available at https://github.com/Yingchi1998/BGL 
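
Written schematically with generic notation (not the paper's), the bilevel coupling places the RAW-to-RGB generative block at the upper level and the downstream vision task at the lower level:

```latex
\min_{\omega}\; \mathcal{L}_{\mathrm{task}}\big(\theta^{*}(\omega);\, G_{\omega}\big)
\quad \text{s.t.} \quad
\theta^{*}(\omega) \in \arg\min_{\theta}\;
\mathcal{L}_{\mathrm{task}}\big(\theta;\, G_{\omega}(x_{\mathrm{raw}})\big),
```

where $G_{\omega}$ is the generative block mapping RAW data to the RGB domain, $\theta$ parameterizes the vision model, and $\mathcal{L}_{\mathrm{task}}$ is the task loss.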