Abstract:We introduce ISALux, a novel transformer-based approach for Low-Light Image Enhancement (LLIE) that seamlessly integrates illumination and semantic priors. Our architecture includes an original self-attention block, Hybrid Illumination and Semantics-Aware Multi-Headed Self- Attention (HISA-MSA), which integrates illumination and semantic segmentation maps for en- hanced feature extraction. ISALux employs two self-attention modules to independently process illumination and semantic features, selectively enriching each other to regulate luminance and high- light structural variations in real-world scenarios. A Mixture of Experts (MoE)-based Feed-Forward Network (FFN) enhances contextual learning, with a gating mechanism conditionally activating the top K experts for specialized processing. To address overfitting in LLIE methods caused by distinct light patterns in benchmarking datasets, we enhance the HISA-MSA module with low-rank matrix adaptations (LoRA). Extensive qualitative and quantitative evaluations across multiple specialized datasets demonstrate that ISALux is competitive with state-of-the-art (SOTA) methods. Addition- ally, an ablation study highlights the contribution of each component in the proposed model. Code will be released upon publication.
Abstract:Low-light image enhancement (LLIE) is a fundamental yet challenging task due to the presence of noise, loss of detail, and poor contrast in images captured under insufficient lighting conditions. Recent methods often rely solely on pixel-level transformations of RGB images, neglecting the rich contextual information available from multiple visual modalities. In this paper, we present ModalFormer, the first large-scale multimodal framework for LLIE that fully exploits nine auxiliary modalities to achieve state-of-the-art performance. Our model comprises two main components: a Cross-modal Transformer (CM-T) designed to restore corrupted images while seamlessly integrating multimodal information, and multiple auxiliary subnetworks dedicated to multimodal feature reconstruction. Central to the CM-T is our novel Cross-modal Multi-headed Self-Attention mechanism (CM-MSA), which effectively fuses RGB data with modality-specific features--including deep feature embeddings, segmentation information, geometric cues, and color information--to generate information-rich hybrid attention maps. Extensive experiments on multiple benchmark datasets demonstrate ModalFormer's state-of-the-art performance in LLIE. Pre-trained models and results are made available at https://github.com/albrateanu/ModalFormer.
Abstract:This work examines the findings of the NTIRE 2025 Shadow Removal Challenge. A total of 306 participants have registered, with 17 teams successfully submitting their solutions during the final evaluation phase. Following the last two editions, this challenge had two evaluation tracks: one focusing on reconstruction fidelity and the other on visual perception through a user study. Both tracks were evaluated with images from the WSRD+ dataset, simulating interactions between self- and cast-shadows with a large number of diverse objects, textures, and materials.
Abstract:This paper presents an overview of the NTIRE 2025 Image Denoising Challenge ({\sigma} = 50), highlighting the proposed methodologies and corresponding results. The primary objective is to develop a network architecture capable of achieving high-quality denoising performance, quantitatively evaluated using PSNR, without constraints on computational complexity or model size. The task assumes independent additive white Gaussian noise (AWGN) with a fixed noise level of 50. A total of 290 participants registered for the challenge, with 20 teams successfully submitting valid results, providing insights into the current state-of-the-art in image denoising.
Abstract:Image dehazing is an ill-posed problem that has been extensively studied in the recent years. The objective performance evaluation of the dehazing methods is one of the major obstacles due to the lacking of a reference dataset. While the synthetic datasets have shown important limitations, the few realistic datasets introduced recently assume homogeneous haze over the entire scene. Since in many real cases haze is not uniformly distributed we introduce NH-HAZE, a non-homogeneous realistic dataset with pairs of real hazy and corresponding haze-free images. This is the first non-homogeneous image dehazing dataset and contains 55 outdoor scenes. The non-homogeneous haze has been introduced in the scene using a professional haze generator that imitates the real conditions of hazy scenes. Additionally, this work presents an objective assessment of several state-of-the-art single image dehazing methods that were evaluated using NH-HAZE dataset.
Abstract:This paper reviews the NTIRE 2020 Challenge on NonHomogeneous Dehazing of images (restoration of rich details in hazy image). We focus on the proposed solutions and their results evaluated on NH-Haze, a novel dataset consisting of 55 pairs of real haze free and nonhomogeneous hazy images recorded outdoor. NH-Haze is the first realistic nonhomogeneous haze dataset that provides ground truth images. The nonhomogeneous haze has been produced using a professional haze generator that imitates the real conditions of haze scenes. 168 participants registered in the challenge and 27 teams competed in the final testing phase. The proposed solutions gauge the state-of-the-art in image dehazing.
Abstract:Single image dehazing is an ill-posed problem that has recently drawn important attention. Despite the significant increase in interest shown for dehazing over the past few years, the validation of the dehazing methods remains largely unsatisfactory, due to the lack of pairs of real hazy and corresponding haze-free reference images. To address this limitation, we introduce Dense-Haze - a novel dehazing dataset. Characterized by dense and homogeneous hazy scenes, Dense-Haze contains 33 pairs of real hazy and corresponding haze-free images of various outdoor scenes. The hazy scenes have been recorded by introducing real haze, generated by professional haze machines. The hazy and haze-free corresponding scenes contain the same visual content captured under the same illumination parameters. Dense-Haze dataset aims to push significantly the state-of-the-art in single-image dehazing by promoting robust methods for real and various hazy scenes. We also provide a comprehensive qualitative and quantitative evaluation of state-of-the-art single image dehazing techniques based on the Dense-Haze dataset. Not surprisingly, our study reveals that the existing dehazing techniques perform poorly for dense homogeneous hazy scenes and that there is still much room for improvement.
Abstract:Haze removal or dehazing is a challenging ill-posed problem that has drawn a significant attention in the last few years. Despite this growing interest, the scientific community is still lacking a reference dataset to evaluate objectively and quantitatively the performance of proposed dehazing methods. The few datasets that are currently considered, both for assessment and training of learning-based dehazing techniques, exclusively rely on synthetic hazy images. To address this limitation, we introduce the first outdoor scenes database (named O-HAZE) composed of pairs of real hazy and corresponding haze-free images. In practice, hazy images have been captured in presence of real haze, generated by professional haze machines, and OHAZE contains 45 different outdoor scenes depicting the same visual content recorded in haze-free and hazy conditions, under the same illumination parameters. To illustrate its usefulness, O-HAZE is used to compare a representative set of state-of-the-art dehazing techniques, using traditional image quality metrics such as PSNR, SSIM and CIEDE2000. This reveals the limitations of current techniques, and questions some of their underlying assumptions.
Abstract:Image dehazing has become an important computational imaging topic in the recent years. However, due to the lack of ground truth images, the comparison of dehazing methods is not straightforward, nor objective. To overcome this issue we introduce a new dataset -named I-HAZE- that contains 35 image pairs of hazy and corresponding haze-free (ground-truth) indoor images. Different from most of the existing dehazing databases, hazy images have been generated using real haze produced by a professional haze machine. For easy color calibration and improved assessment of dehazing algorithms, each scene include a MacBeth color checker. Moreover, since the images are captured in a controlled environment, both haze-free and hazy images are captured under the same illumination conditions. This represents an important advantage of the I-HAZE dataset that allows us to objectively compare the existing image dehazing techniques using traditional image quality metrics such as PSNR and SSIM.