Abstract:We introduce ISALux, a novel transformer-based approach for Low-Light Image Enhancement (LLIE) that seamlessly integrates illumination and semantic priors. Our architecture includes an original self-attention block, Hybrid Illumination and Semantics-Aware Multi-Headed Self-Attention (HISA-MSA), which integrates illumination and semantic segmentation maps for enhanced feature extraction. ISALux employs two self-attention modules to independently process illumination and semantic features, selectively enriching each other to regulate luminance and highlight structural variations in real-world scenarios. A Mixture of Experts (MoE)-based Feed-Forward Network (FFN) enhances contextual learning, with a gating mechanism conditionally activating the top K experts for specialized processing. To address overfitting in LLIE methods caused by distinct light patterns in benchmarking datasets, we enhance the HISA-MSA module with low-rank matrix adaptations (LoRA). Extensive qualitative and quantitative evaluations across multiple specialized datasets demonstrate that ISALux is competitive with state-of-the-art (SOTA) methods. Additionally, an ablation study highlights the contribution of each component in the proposed model. Code will be released upon publication.
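The top-K gating mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the form of the gate, so this example assumes the common sparse-gating pattern: keep the K largest gate logits, renormalize them with a softmax, and mix only the selected experts. The names `top_k_gate` and `moe_forward`, the toy scalar experts, and the fixed gate logits are all hypothetical; in a real network the logits would come from a learned layer applied to the input.

```python
import math


def top_k_gate(logits, k):
    """Return {expert_index: weight} for the k largest gate logits."""
    # Indices of the k highest-scoring experts.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the selected experts (max subtracted for stability).
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x, experts, gate_logits, k):
    """Weighted sum of the outputs of the k selected experts only."""
    weights = top_k_gate(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Toy setup (assumption): three scalar "experts" and fixed gate logits.
experts = [lambda x: x, lambda x: 2.0 * x, lambda x: 3.0 * x]
gate_logits = [1.0, 3.0, 2.0]

weights = top_k_gate(gate_logits, k=2)   # experts 1 and 2 are selected
output = moe_forward(1.0, experts, gate_logits, k=2)
```

Only the selected experts are evaluated, which is what makes the activation conditional: expert 0 is never called in this toy run.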
Abstract:Low-light image enhancement (LLIE) is a fundamental yet challenging task due to the presence of noise, loss of detail, and poor contrast in images captured under insufficient lighting conditions. Recent methods often rely solely on pixel-level transformations of RGB images, neglecting the rich contextual information available from multiple visual modalities. In this paper, we present ModalFormer, the first large-scale multimodal framework for LLIE that fully exploits nine auxiliary modalities to achieve state-of-the-art performance. Our model comprises two main components: a Cross-modal Transformer (CM-T) designed to restore corrupted images while seamlessly integrating multimodal information, and multiple auxiliary subnetworks dedicated to multimodal feature reconstruction. Central to the CM-T is our novel Cross-modal Multi-headed Self-Attention mechanism (CM-MSA), which effectively fuses RGB data with modality-specific features--including deep feature embeddings, segmentation information, geometric cues, and color information--to generate information-rich hybrid attention maps. Extensive experiments on multiple benchmark datasets demonstrate ModalFormer's state-of-the-art performance in LLIE. Pre-trained models and results are made available at https://github.com/albrateanu/ModalFormer.
Abstract:This paper presents an overview of the NTIRE 2025 Image Denoising Challenge (σ = 50), highlighting the proposed methodologies and corresponding results. The primary objective is to develop a network architecture capable of achieving high-quality denoising performance, quantitatively evaluated using PSNR, without constraints on computational complexity or model size. The task assumes independent additive white Gaussian noise (AWGN) with a fixed noise level of 50. A total of 290 participants registered for the challenge, with 20 teams successfully submitting valid results, providing insights into the current state-of-the-art in image denoising.
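As a rough illustration of the task setup described above, the sketch below adds independent Gaussian noise with standard deviation 50 to a synthetic signal and measures PSNR. It assumes 8-bit intensities (peak value 255) and unclipped noise; the helper names `add_awgn` and `psnr` and the random "image" are hypothetical and are not taken from the challenge code.

```python
import math
import random


def add_awgn(pixels, sigma, rng):
    """Add independent zero-mean Gaussian noise to each pixel (no clipping)."""
    return [p + rng.gauss(0.0, sigma) for p in pixels]


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized signals."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


rng = random.Random(0)
# Synthetic flat "image" of 20000 pixel intensities (assumption, for illustration).
clean = [rng.uniform(0.0, 255.0) for _ in range(20000)]
noisy = add_awgn(clean, sigma=50.0, rng=rng)
noisy_psnr = psnr(clean, noisy)
```

Under these assumptions the noisy input sits near 20·log10(255/50) ≈ 14.15 dB, which is the baseline a denoiser has to improve on.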
Abstract:Choosing parameters for classical edge detection algorithms can be a costly and complex task. Choosing the correct parameters can considerably improve the resulting edge map. In this paper, we present a version of the Edge Drawing algorithm that includes an automated threshold selection step. To better highlight the effect of this additional step, we use different first-order operators in the algorithm. Visual and statistical results are presented to support the benefits of the proposed automated threshold scheme.
Abstract:Edges are a fundamental feature in image processing, used directly or indirectly in a huge number of applications. Dilated convolution techniques emerged in response to the growth of image resolution and processing power. Dilated convolutions have shown impressive results in machine learning; here we discuss the idea of dilating the standard filters used in edge detection algorithms. In this work, we bring together our previous and current results by replacing the classical convolution filters with dilated ones. We compare the results of edge detection algorithms using the proposed dilated filters with those obtained using the original filters or custom variants. Experimental results confirm our claim that dilating the filters has a positive impact on edge detection algorithms, from simple to rather complex ones.
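To make the idea of dilating a standard edge filter concrete, here is a minimal sketch. It assumes dilation in the usual sense of dilated convolution, that is, inserting zeros between the taps of the kernel; the abstract does not state the exact construction the authors use. The helpers `dilate_kernel` and `correlate_valid` and the synthetic step-edge image are hypothetical.

```python
def dilate_kernel(kernel, rate):
    """Spread kernel taps apart by inserting (rate - 1) zeros between them."""
    rows = (len(kernel) - 1) * rate + 1
    cols = (len(kernel[0]) - 1) * rate + 1
    out = [[0] * cols for _ in range(rows)]
    for i, row in enumerate(kernel):
        for j, value in enumerate(row):
            out[i * rate][j * rate] = value
    return out


def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(kernel[a][b] * image[i + a][j + b]
                for a in range(kh) for b in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]


# Standard 3x3 Sobel kernel for horizontal gradients.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
dilated_sobel_x = dilate_kernel(sobel_x, rate=2)   # 5x5, same nine nonzero taps

# Synthetic 5x6 image with a vertical step edge (assumption, for illustration).
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
response_standard = correlate_valid(image, sobel_x)
response_dilated = correlate_valid(image, dilated_sobel_x)
```

The dilated kernel keeps the same number of nonzero weights as the original while covering a 5x5 neighbourhood, so it samples the gradient over a wider support.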
Abstract:Augmented Reality (AR) is an environment-enhancing technology, widely applied in many domains, such as tourism and culture. One of the major challenges in this field is the precise detection and extraction of building information through Computer Vision techniques. Edge detection is one of the building-block operations for many feature extraction solutions in Computer Vision. AR systems use edge detection for building extraction or for the extraction of facade details from buildings. In this paper, we propose a novel filter operator for edge detection that aims to better extract building contours or facade features. The proposed filter gives more weight to vertical and horizontal edges, which is an important property for our aim.
Abstract:Edge detection is a fundamental and widely used operation in computer vision. The detected edges are further used by various algorithms, from line detection to machine learning methods that identify objects based on their contours. Inspired by new convolution techniques in machine learning, we discuss here the idea of extending the standard Sobel kernels, which are used to compute the gradient of an image in order to find its edges. We compare the results of our custom extended filters with those of the standard Sobel filter and other edge detection filters, using different image sets and algorithms. We present statistical results on the improvements obtained with the custom extended Sobel filters.