Risheng Liu

Diving into Darkness: A Dual-Modulated Framework for High-Fidelity Super-Resolution in Ultra-Dark Environments

Sep 11, 2023
Jiaxin Gao, Ziyu Yue, Yaohua Liu, Sihan Xie, Xin Fan, Risheng Liu

Super-resolution of images captured in ultra-dark environments is a practical yet challenging problem that has received little attention. Due to uneven illumination and low signal-to-noise ratio in dark environments, a multitude of problems such as lack of detail and color distortion may be magnified in the super-resolution process compared to normal-lighting environments. Consequently, conventional low-light enhancement or super-resolution methods, whether applied individually or in a cascaded manner to this problem, often encounter limitations in recovering luminance, color fidelity, and intricate details. To conquer these issues, this paper proposes a specialized dual-modulated learning framework that, for the first time, attempts to deeply dissect the nature of the low-light super-resolution task. Leveraging natural image color characteristics, we introduce a self-regularized luminance constraint as a prior for addressing uneven lighting. Expanding on this, we develop Illuminance-Semantic Dual Modulation (ISDM) components to enhance feature-level preservation of illumination and color details. Besides, instead of deploying naive up-sampling strategies, we design the Resolution-Sensitive Merging Up-sampler (RSMU) module, which brings together different sampling modalities as substrates, effectively mitigating the presence of artifacts and halos. Comprehensive experiments showcase the applicability and generalizability of our approach to diverse and challenging ultra-low-light conditions, outperforming state-of-the-art methods with a notable improvement (i.e., $\uparrow$5\% in PSNR, and $\uparrow$43\% in LPIPS). Especially noteworthy is the 19-fold increase in the RMSE score, underscoring our method's exceptional generalization across different darkness levels. The code will be available online upon publication of the paper.
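
The RSMU and ISDM components are only named in the abstract; as a rough illustration of the up-sampling idea (merging several sampling modalities with learned weights), here is a minimal, hypothetical PyTorch sketch. The module name, branch choices (pixel-shuffle and bilinear interpolation), and the gating scheme are assumptions made for illustration, not the authors' design.

```python
# Hypothetical sketch of a "merging up-sampler" that fuses two up-sampling
# modalities (sub-pixel convolution and bilinear interpolation) with learned
# per-pixel weights. Names and structure are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MergingUpsampler(nn.Module):
    def __init__(self, channels: int, scale: int = 4):
        super().__init__()
        self.scale = scale
        # Branch 1: sub-pixel (pixel-shuffle) up-sampling.
        self.subpixel = nn.Sequential(
            nn.Conv2d(channels, channels * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )
        # Branch 2: bilinear up-sampling followed by a refining convolution.
        self.refine = nn.Conv2d(channels, channels, 3, padding=1)
        # Predict per-pixel merging weights for the two branches.
        self.gate = nn.Conv2d(channels * 2, 2, 3, padding=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        a = self.subpixel(feat)
        b = self.refine(F.interpolate(feat, scale_factor=self.scale,
                                      mode="bilinear", align_corners=False))
        w = torch.softmax(self.gate(torch.cat([a, b], dim=1)), dim=1)
        return w[:, 0:1] * a + w[:, 1:2] * b

x = torch.randn(1, 64, 32, 32)           # low-resolution feature map
print(MergingUpsampler(64, 4)(x).shape)  # torch.Size([1, 64, 128, 128])
```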

* 9 pages 

Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation

Sep 07, 2023
Xiaohan Cui, Long Ma, Tengyu Ma, Jinyuan Liu, Xin Fan, Risheng Liu

Object detection in low-light scenarios has attracted much attention in the past few years. A mainstream and representative scheme introduces enhancers as the pre-processing for regular detectors. However, because of the disparity in task objectives between the enhancer and detector, this paradigm cannot realize its full potential. In this work, we try to unlock the potential of the enhancer + detector combination. Different from existing works, we extend illumination-based enhancers (either newly designed or existing) into a scene decomposition module, whose removed illumination is exploited as an auxiliary in the detector for extracting detection-friendly features. A semantic aggregation module is further established for integrating multi-scale scene-related semantic information in the context space. In effect, our scheme successfully transforms the "trash" (i.e., the illumination ignored by the detector) into "treasure" for the detector. Extensive experiments are conducted to reveal our superiority over other state-of-the-art methods. The code will be made public upon acceptance.
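
For readers unfamiliar with the "trash to treasure" idea, the following is a minimal, hypothetical sketch of a Retinex-style scene decomposition whose removed illumination is reused as an auxiliary detector input. All module names and layer choices are illustrative assumptions, not the paper's architecture.

```python
# Hypothetical sketch: a Retinex-style enhancer splits the low-light image into
# illumination and reflectance, and the removed illumination map is reused as an
# auxiliary input to the detector's feature extractor. Names are illustrative.
import torch
import torch.nn as nn

class SceneDecomposer(nn.Module):
    """Predicts a single-channel illumination map; reflectance = image / illumination."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, img):
        illum = self.net(img).clamp(min=1e-3)
        reflectance = img / illum          # enhanced, detection-friendly image
        return reflectance, illum          # the removed illumination becomes the "treasure"

class IlluminationGuidedBackbone(nn.Module):
    """Toy backbone that fuses the auxiliary illumination with image features."""
    def __init__(self):
        super().__init__()
        self.img_branch = nn.Conv2d(3, 64, 3, padding=1)
        self.illum_branch = nn.Conv2d(1, 64, 3, padding=1)
        self.fuse = nn.Conv2d(128, 64, 1)

    def forward(self, reflectance, illum):
        f = torch.cat([self.img_branch(reflectance), self.illum_branch(illum)], dim=1)
        return self.fuse(f)                 # detection-friendly features

img = torch.rand(1, 3, 256, 256)
refl, illum = SceneDecomposer()(img)
feats = IlluminationGuidedBackbone()(refl, illum)
print(feats.shape)                          # torch.Size([1, 64, 256, 256])
```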

Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Sep 03, 2023
Guanyao Wu, Hongming Fu, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels. Despite its advancements, the field grapples with challenges, notably the reliance on manual designs for network structures and loss functions, and the constraints of utilizing simulated reference images as ground truths. Consequently, current methodologies often suffer from color distortions and exposure artifacts, further complicating the quest for authentic image representation. To address these challenges, this paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for the automatic design of both network structures and loss functions. More specifically, we harness a unique dual-search mechanism rooted in a novel weighted structure refinement architecture search. Besides, a hybrid supervised contrast constraint seamlessly guides and integrates with the searching process, facilitating a more adaptive and comprehensive search for optimal loss functions. We achieve state-of-the-art performance in comparison to various competitive schemes, yielding 10.61% and 4.38% improvements in Visual Information Fidelity (VIF) for general and no-reference scenarios, respectively, while providing results with high contrast, rich details, and colors.
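
The abstract describes a bi-level search over loss functions (and network structures). Below is a minimal, hypothetical sketch of such a dual-loop optimization in PyTorch, with toy data and only two candidate loss terms; it illustrates the general bi-level pattern rather than the HSDS-MEF algorithm.

```python
# Hypothetical bi-level (dual) search loop: the inner level updates network
# weights under the currently weighted loss terms; the outer level updates the
# softmax-relaxed loss weights (architecture parameters could be treated the
# same way) on a validation objective. Everything here is illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Conv2d(6, 3, 3, padding=1)                 # toy fusion network (2 exposures -> 1)
loss_logits = torch.zeros(2, requires_grad=True)    # weights over candidate loss terms
w_opt = torch.optim.SGD(net.parameters(), lr=1e-2)  # inner (weights) optimizer
a_opt = torch.optim.Adam([loss_logits], lr=1e-3)    # outer (search) optimizer

def candidate_losses(pred, ref):
    return torch.stack([F.l1_loss(pred, ref),        # term 1: intensity fidelity
                        F.mse_loss(pred, ref)])      # term 2: another stand-in term

for step in range(100):
    pair = torch.rand(4, 6, 64, 64)                  # under/over-exposed pair (stacked)
    ref = torch.rand(4, 3, 64, 64)                   # pseudo reference (stand-in)

    # Inner level: train weights under the current mixture of loss terms.
    w_opt.zero_grad()
    mix = torch.softmax(loss_logits, 0)
    (mix.detach() * candidate_losses(net(pair), ref)).sum().backward()
    w_opt.step()

    # Outer level: update the loss-term weights on held-out data.
    val_pair, val_ref = torch.rand(4, 6, 64, 64), torch.rand(4, 3, 64, 64)
    a_opt.zero_grad()
    (torch.softmax(loss_logits, 0) * candidate_losses(net(val_pair), val_ref)).sum().backward()
    a_opt.step()
```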

AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

Sep 03, 2023
Xingyuan Li, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

Monocular 3D object detection plays a pivotal role in the field of autonomous driving, and numerous deep learning-based methods have made significant breakthroughs in this area. Despite the advancements in detection accuracy and efficiency, these models tend to fail when faced with adversarial attacks, rendering them ineffective. Therefore, bolstering the adversarial robustness of 3D detection models has become a crucial issue that demands immediate attention and innovative solutions. To mitigate this issue, we propose a depth-aware robust adversarial training method for monocular 3D object detection, dubbed DART3D. Specifically, we first design an adversarial attack that iteratively degrades the 2D and 3D perception capabilities of 3D object detection models (IDP), which serves as the foundation for our subsequent defense mechanism. In response to this attack, we propose an uncertainty-based residual learning method for adversarial training. Our adversarial training approach capitalizes on the inherent uncertainty, enabling the model to significantly improve its robustness against adversarial attacks. We conducted extensive experiments on the KITTI 3D dataset, demonstrating that DART3D surpasses direct adversarial training (the most popular approach) under attacks in 3D object detection $AP_{R40}$ of the car category for the Easy, Moderate, and Hard settings, with improvements of 4.415%, 4.112%, and 3.195%, respectively.
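
As a rough illustration of the recipe the abstract outlines (an iterative attack that degrades both the 2D and 3D losses, followed by adversarial training), here is a minimal PGD-style sketch with a toy detector. The detector, losses, and hyper-parameters are stand-ins, not DART3D.

```python
# Hypothetical sketch: an iterative (PGD-style) attack jointly maximizes the 2D
# and 3D detection losses, and the detector is then updated on the attacked
# images. The toy detector and targets below are placeholders.
import torch
import torch.nn as nn

class ToyMono3DDetector(nn.Module):
    """Stand-in detector returning separate 2D and 3D regression losses."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, 3, padding=1)
        self.head_2d = nn.Linear(8, 4)    # e.g. 2D box parameters
        self.head_3d = nn.Linear(8, 7)    # e.g. 3D box parameters

    def compute_losses(self, images, targets_2d, targets_3d):
        feat = self.backbone(images).mean(dim=(2, 3))
        loss_2d = nn.functional.l1_loss(self.head_2d(feat), targets_2d)
        loss_3d = nn.functional.l1_loss(self.head_3d(feat), targets_3d)
        return loss_2d, loss_3d

def iterative_attack(det, images, t2d, t3d, steps=5, eps=4/255, alpha=1/255):
    """PGD-style attack maximizing the combined 2D + 3D loss within an L_inf ball."""
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss_2d, loss_3d = det.compute_losses(adv, t2d, t3d)
        grad = torch.autograd.grad(loss_2d + loss_3d, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()                 # ascend the loss
            adv = images + (adv - images).clamp(-eps, eps)  # project back to the ball
            adv = adv.clamp(0, 1)
    return adv.detach()

det = ToyMono3DDetector()
opt = torch.optim.SGD(det.parameters(), lr=1e-3)
imgs, t2d, t3d = torch.rand(2, 3, 64, 64), torch.rand(2, 4), torch.rand(2, 7)
adv = iterative_attack(det, imgs, t2d, t3d)
opt.zero_grad()
sum(det.compute_losses(adv, t2d, t3d)).backward()  # adversarial training update
opt.step()
```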

Dual Adversarial Resilience for Collaborating Robust Underwater Image Enhancement and Perception

Sep 03, 2023
Zengxi Zhang, Zhiying Jiang, Zeru Shi, Jinyuan Liu, Risheng Liu

Due to the uneven scattering and absorption of different light wavelengths in aquatic environments, underwater images suffer from low visibility and noticeable color deviations. With the advancement of autonomous underwater vehicles, extensive research has been conducted on learning-based underwater enhancement algorithms. These works can generate visually pleasing enhanced images and mitigate the adverse effects of degraded images on subsequent perception tasks. However, learning-based methods are inherently susceptible to adversarial attacks, which can significantly disrupt their results. In this work, we introduce a collaborative adversarial resilience network, dubbed CARNet, for underwater image enhancement and subsequent detection tasks. Concretely, we first introduce an invertible network with strong perturbation-perceptual abilities to isolate attacks from underwater images, preventing interference with image enhancement and perceptual tasks. Furthermore, we propose a synchronized attack training strategy with both visual-driven and perception-driven attacks, enabling the network to discern and remove various types of attacks. Additionally, we incorporate an attack pattern discriminator to heighten the robustness of the network against different attacks. Extensive experiments demonstrate that the proposed method outputs visually appealing enhanced images and achieves, on average, 6.71% higher detection mAP than state-of-the-art methods.
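
The invertible network itself is not detailed in the abstract; the sketch below shows a generic additive coupling layer, the standard building block of invertible networks, which could in principle route perturbation-related information into a separable component. It is an assumption-laden illustration, not the CARNet design.

```python
# Hypothetical sketch of an invertible (additive coupling) layer. Such layers
# can, in principle, map an attacked image into components from which a
# perturbation-related part can be separated and discarded before enhancement.
# This is a generic coupling layer for illustration, not the CARNet architecture.
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.f = nn.Sequential(nn.Conv2d(half, half, 3, padding=1), nn.ReLU(inplace=True),
                               nn.Conv2d(half, half, 3, padding=1))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        return torch.cat([x1, x2 + self.f(x1)], dim=1)   # invertible transform

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        return torch.cat([y1, y2 - self.f(y1)], dim=1)   # exact reconstruction

layer = AdditiveCoupling(6)
x = torch.rand(1, 6, 32, 32)                  # e.g. image + auxiliary channels
y = layer(x)
print(torch.allclose(layer.inverse(y), x, atol=1e-6))    # True: lossless / invertible
```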

* 9 pages, 9 figures 

Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework

Sep 03, 2023
Zhu Liu, Zihang Chen, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

The detection of small infrared targets against blurred and cluttered backgrounds has remained an enduring challenge. In recent years, learning-based schemes have become the mainstream methodology to establish the mapping directly. However, these methods are susceptible to the inherent complexities of changing backgrounds and real-world disturbances, leading to unreliable and compromised target estimations. In this work, we propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions. We first propose a bi-level optimization formulation to introduce dynamic adversarial learning: the learnable generation of corruptions maximizes the losses as the lower-level objective, while the robustness promotion of detectors serves as the upper-level one. We also provide a hierarchical reinforcement learning strategy to discover the most detrimental corruptions and balance the trade-off between robustness and accuracy. To better disentangle the corruptions from salient features, we further propose a spatial-frequency interaction network for target detection. Extensive experiments demonstrate that our scheme remarkably improves IoU by 21.96% across a wide array of corruptions and by a notable 4.97% on the general benchmark. The source codes are available at https://github.com/LiuZhu-CV/BALISTD.
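
The following is a minimal, hypothetical sketch of the bi-level min-max training pattern the abstract describes: a learnable corruption generator maximizes the detection loss (lower level) while the detector is trained for robustness on the corrupted inputs (upper level). The toy detector, corruptor, and loss are illustrative stand-ins.

```python
# Hypothetical bi-level min-max loop: the lower level learns corruptions that
# maximize the detection loss; the upper level trains the detector to be robust
# to the corrupted inputs. Both networks are tiny stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

detector = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(8, 1, 3, padding=1))              # per-pixel target logits
corruptor = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(inplace=True),
                          nn.Conv2d(8, 1, 3, padding=1), nn.Tanh())  # bounded corruption

det_opt = torch.optim.Adam(detector.parameters(), lr=1e-3)   # upper-level optimizer
cor_opt = torch.optim.Adam(corruptor.parameters(), lr=1e-3)  # lower-level optimizer

for step in range(100):
    img = torch.rand(4, 1, 64, 64)                     # infrared frame (stand-in)
    mask = (torch.rand(4, 1, 64, 64) > 0.99).float()   # sparse small-target mask

    # Lower level: update the corruption generator to maximize the detection loss.
    cor_opt.zero_grad()
    corrupted = (img + 0.1 * corruptor(img)).clamp(0, 1)
    (-F.binary_cross_entropy_with_logits(detector(corrupted), mask)).backward()
    cor_opt.step()

    # Upper level: update the detector on the (frozen) corrupted inputs.
    det_opt.zero_grad()
    corrupted = (img + 0.1 * corruptor(img)).detach().clamp(0, 1)
    F.binary_cross_entropy_with_logits(detector(corrupted), mask).backward()
    det_opt.step()
```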

* 9 pages, 6 figures 

ASF-Net: Robust Video Deraining via Temporal Alignment and Online Adaptive Learning

Sep 02, 2023
Xinwei Xue, Jia He, Long Ma, Xiangyu Meng, Wenlin Li, Risheng Liu

In recent times, learning-based methods for video deraining have demonstrated commendable results. However, there are two critical challenges that these methods have yet to address: exploiting temporal correlations among adjacent frames and ensuring adaptability to unknown real-world scenarios. To overcome these challenges, we explore video deraining from paradigm design to learning strategy construction. Specifically, we propose a new computational paradigm, the Alignment-Shift-Fusion Network (ASF-Net), which incorporates a temporal shift module. This module, novel to this field, provides deeper exploration of temporal information by facilitating the exchange of channel-level information within the feature space. To fully unleash the model's characterization capability, we further construct a LArge-scale RAiny video dataset (LARA), which also supports the development of this community. On the basis of the newly constructed dataset, we explore the parameter learning process by developing an innovative re-degraded learning strategy. This strategy bridges the gap between synthetic and real-world scenes, resulting in stronger scene adaptability. Our proposed approach exhibits superior performance on three benchmarks and compelling visual quality in real-world scenarios, underscoring its efficacy. The code is available at https://github.com/vis-opt-group/ASF-Net.
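
The temporal shift operation referenced here follows a well-known idea: a fraction of feature channels is shifted forward/backward along the time axis so adjacent frames exchange information at zero extra parameter cost. The sketch below shows that generic operation, not ASF-Net's exact variant.

```python
# Hypothetical sketch of a generic temporal shift over video features.
import torch

def temporal_shift(x: torch.Tensor, fold_div: int = 8) -> torch.Tensor:
    """x: (N, T, C, H, W) video features; shifts C // fold_div channels each way in time."""
    n, t, c, h, w = x.shape
    fold = c // fold_div
    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                  # shift these channels backward in time
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]  # shift these channels forward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # remaining channels unchanged
    return out

feats = torch.randn(2, 5, 32, 64, 64)   # 2 clips, 5 frames, 32 channels
print(temporal_shift(feats).shape)      # torch.Size([2, 5, 32, 64, 64])
```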

Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

Sep 02, 2023
Gehui Li, Jinyuan Liu, Long Ma, Zhiying Jiang, Xin Fan, Risheng Liu

Photographs taken with less-than-ideal exposure settings often display poor visual quality. Since the correction procedures vary significantly, it is difficult for a single neural network to handle all exposure problems. Moreover, the inherent limitations of convolutions hinder the model's ability to restore faithful color or details in extremely over-/under-exposed regions. To overcome these limitations, we propose a Macro-Micro-Hierarchical Transformer, which consists of a macro attention to capture long-range dependencies, a micro attention to extract local features, and a hierarchical structure for coarse-to-fine correction. Specifically, the complementary macro-micro attention designs enhance locality while allowing global interactions. The hierarchical structure enables the network to correct exposure errors of different scales layer by layer. Furthermore, we propose a contrast constraint and couple it seamlessly into the loss function, where the corrected image is pulled towards the positive sample and pushed away from dynamically generated negative samples. Thus, the remaining color distortion and loss of detail can be removed. We also extend our method as an image enhancer for low-light face recognition and low-light semantic segmentation. Experiments demonstrate that our approach obtains more attractive results than state-of-the-art methods both quantitatively and qualitatively.
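
As a rough illustration of a contrast constraint of this kind (pulling the corrected image toward a positive and away from negatives in a feature space), here is a minimal, hypothetical sketch; the feature extractor, distance, and negative samples are assumptions, not the paper's loss.

```python
# Hypothetical contrast constraint: distance to the positive divided by the
# average distance to negatives, measured in a toy feature space.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten())   # toy feature extractor

def contrast_constraint(corrected, positive, negatives):
    """L = d(corrected, positive) / (mean_n d(corrected, negative_n) + eps)."""
    f_c, f_p = feat(corrected), feat(positive)
    num = F.l1_loss(f_c, f_p)
    den = sum(F.l1_loss(f_c, feat(n)) for n in negatives) / len(negatives)
    return num / (den + 1e-6)

corrected = torch.rand(2, 3, 64, 64)       # network output
positive = torch.rand(2, 3, 64, 64)        # well-exposed reference
negatives = [torch.rand(2, 3, 64, 64) for _ in range(3)]  # e.g. badly exposed variants
loss = contrast_constraint(corrected, positive, negatives)
loss.backward()                            # differentiable, so it can join the total loss
```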

* Accepted by ACM MM 2023 

Learning Heavily-Degraded Prior for Underwater Object Detection

Aug 24, 2023
Chenping Fu, Xin Fan, Jiewen Xiao, Wanqi Yuan, Risheng Liu, Zhongxuan Luo

Underwater object detection suffers from low detection performance because the distance- and wavelength-dependent imaging process yields evident image quality degradations such as haze-like effects, low visibility, and color distortions. Therefore, we commit to resolving the issue of underwater object detection under compounded environmental degradations. Typical approaches attempt to develop sophisticated deep architectures to generate high-quality images or features. However, these methods only work for limited ranges because imaging factors are either unstable, too sensitive, or compounded. Unlike these approaches catering for high-quality images or features, this paper seeks transferable prior knowledge from detector-friendly images. The prior guides detectors in removing degradations that interfere with detection. It is based on the statistical observation that the heavily degraded regions of detector-friendly (DFUI) and underwater images have evident feature distribution gaps, while their lightly degraded regions overlap. Therefore, we propose a residual feature transference module (RFTM) to learn a mapping between deep representations of the heavily degraded patches of DFUI and underwater images, and use this mapping as a heavily degraded prior (HDP) for underwater detection. Since the statistical properties are independent of image content, HDP can be learned without the supervision of semantic labels and plugged into popular CNN-based feature extraction networks to improve their performance on underwater object detection. Without bells and whistles, evaluations on URPC2020 and UODD show that our methods outperform CNN-based detectors by a large margin. Our method, with higher speed and fewer parameters, still performs better than transformer-based detectors. Our code and DFUI dataset can be found at https://github.com/xiaoDetection/Learning-Heavily-Degraed-Prior.
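
A minimal, hypothetical sketch of the residual-transference idea follows: a small residual adapter maps underwater backbone features toward their detector-friendly (DFUI) counterparts using a label-free feature-alignment loss. The adapter design and loss are illustrative, not the RFTM specification.

```python
# Hypothetical residual adapter trained to align underwater features with
# detector-friendly (DFUI) features, without semantic labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualFeatureTransfer(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.adapter = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, underwater_feat):
        # Residual mapping: keep the original features and learn a correction.
        return underwater_feat + self.adapter(underwater_feat)

rftm = ResidualFeatureTransfer(256)
opt = torch.optim.Adam(rftm.parameters(), lr=1e-4)

uw_feat = torch.randn(2, 256, 32, 32)     # backbone features of an underwater image
dfui_feat = torch.randn(2, 256, 32, 32)   # features of its detector-friendly counterpart

opt.zero_grad()
F.mse_loss(rftm(uw_feat), dfui_feat).backward()  # label-free feature alignment
opt.step()
```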

Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration

Aug 22, 2023
Di Wang, Jinyuan Liu, Long Ma, Risheng Liu, Xin Fan

Misalignments between multi-modality images pose challenges in image fusion, manifesting as structural distortions and edge ghosts. Existing efforts commonly resort to registering first and fusing later, typically employing two cascaded stages for registration, i.e., coarse registration and fine registration. Both stages directly estimate the respective target deformation fields. In this paper, we argue that the separated two-stage registration is not compact, and that the direct estimation of the target deformation fields is not accurate enough. To address these challenges, we propose a Cross-modality Multi-scale Progressive Dense Registration (C-MPDR) scheme, which accomplishes coarse-to-fine registration exclusively using a one-stage optimization, thus improving the fusion performance of misaligned multi-modality images. Specifically, two pivotal components are involved: a dense Deformation Field Fusion (DFF) module and a Progressive Feature Fine (PFF) module. The DFF aggregates the predicted multi-scale deformation sub-fields at the current scale, while the PFF progressively refines the remaining misaligned features. Both work together to accurately estimate the final deformation fields. In addition, we develop a Transformer-Conv-based Fusion (TCF) subnetwork that considers local and long-range feature dependencies, allowing us to capture more informative features from the registered infrared and visible images for the generation of high-quality fused images. Extensive experimental analysis demonstrates the superiority of the proposed method in the fusion of misaligned cross-modality images.
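
To make the deformation-field machinery concrete, here is a minimal, hypothetical sketch of (a) aggregating multi-scale deformation sub-fields and (b) warping features with the resulting field via grid sampling. The simple upsample-and-sum aggregation (ignoring the displacement rescaling a real implementation would apply) is an illustrative stand-in for the DFF/PFF design, not the paper's modules.

```python
# Hypothetical sketch of deformation-field aggregation and warping.
import torch
import torch.nn.functional as F

def warp(feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """feat: (N, C, H, W); flow: (N, 2, H, W) pixel displacements (dx, dy)."""
    n, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack([xs, ys], dim=0).float().unsqueeze(0).to(feat)  # (1, 2, H, W)
    coords = base + flow
    # Normalize to [-1, 1] as required by grid_sample, and move to (N, H, W, 2).
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack([coords_x, coords_y], dim=-1)
    return F.grid_sample(feat, grid, align_corners=True)

def aggregate_subfields(subfields):
    """Aggregate multi-scale deformation sub-fields into one full-resolution field."""
    h, w = subfields[0].shape[-2:]
    up = [F.interpolate(f, size=(h, w), mode="bilinear", align_corners=True) for f in subfields]
    return torch.stack(up, dim=0).sum(dim=0)

moving = torch.rand(1, 16, 64, 64)                                  # misaligned-modality features
subfields = [torch.randn(1, 2, 64, 64), torch.randn(1, 2, 32, 32)]  # multi-scale predictions
registered = warp(moving, aggregate_subfields(subfields))
print(registered.shape)                                             # torch.Size([1, 16, 64, 64])
```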
