All-in-One Degradation-Aware Fusion Models (ADFMs), a class of multi-modal image fusion models, address complex scenes by mitigating degradations in the source images and generating high-quality fused images. Mainstream ADFMs often rely on heavily synthetic multi-modal, multi-quality images for supervision, which limits their effectiveness in cross-modal and rare degradation scenarios. The inherent relationship among these multi-modal, multi-quality images of the same scene provides explicit supervision for training, but it is also the root of the problems above. To address these limitations, we present LURE, a degradation-aware Learning-driven Unified Representation model for infrared and visible image fusion. LURE decouples multi-modal, multi-quality data at the data level and recouples their relationship in a unified latent feature space (ULFS) through a novel unified loss. This decoupling circumvents the data-level limitations of prior models and allows real-world restoration datasets to be leveraged for training high-quality degradation-aware models, sidestepping the issues above. To enhance text-image interaction, we further introduce Text-Guided Attention (TGA) and an inner residual structure, which strengthen the text's spatial perception of the image and preserve more visual details. Experiments show that our method outperforms state-of-the-art (SOTA) methods across general fusion, degradation-aware fusion, and downstream tasks. The code will be made publicly available.
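For intuition only, the following is a minimal sketch of how a text-guided attention block with an inner residual connection could be wired up in PyTorch. The class name `TextGuidedAttention`, the dimensions, and the residual placement are illustrative assumptions, not the paper's actual TGA implementation.

```python
import torch
import torch.nn as nn


class TextGuidedAttention(nn.Module):
    """Illustrative sketch: image tokens query text tokens via cross-attention,
    and an inner residual re-injects the original visual features.
    (Assumed design, not the authors' released code.)"""

    def __init__(self, img_dim: int, text_dim: int, num_heads: int = 8):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, img_dim)  # align text width to image width
        self.attn = nn.MultiheadAttention(img_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(img_dim)

    def forward(self, img_feats: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
        # img_feats:  (B, H*W, C)  flattened spatial tokens
        # text_feats: (B, T, D)    text embeddings, e.g. from a frozen text encoder
        text = self.text_proj(text_feats)                              # (B, T, C)
        attended, _ = self.attn(query=img_feats, key=text, value=text)  # text guides each spatial token
        # Inner residual: keep the original visual details alongside the text-modulated features.
        return self.norm(img_feats + attended)


if __name__ == "__main__":
    tga = TextGuidedAttention(img_dim=256, text_dim=512)
    img = torch.randn(2, 64 * 64, 256)  # toy flattened feature map
    txt = torch.randn(2, 16, 512)       # toy text-token embeddings
    print(tga(img, txt).shape)          # torch.Size([2, 4096, 256])
```

Using the image tokens as queries over the text tokens is one plausible way to give the text spatial awareness of the image while the residual path preserves visual detail; the actual TGA module may differ.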