Detecting small to tiny targets in infrared images is a challenging task in computer vision, especially when it comes to differentiating these targets from noisy or textured backgrounds. Traditional object detection methods such as YOLO struggle to detect tiny objects compared to segmentation neural networks, resulting in weaker performance when detecting small targets. To reduce the number of false alarms while maintaining a high detection rate, we introduce an $\textit{a contrario}$ decision criterion into the training of a YOLO detector. The latter takes advantage of the $\textit{unexpectedness}$ of small targets to discriminate them from complex backgrounds. Adding this statistical criterion to a YOLOv7-tiny bridges the performance gap between state-of-the-art segmentation methods for infrared small target detection and object detection networks. It also significantly increases the robustness of YOLO towards few-shot settings.
The detection of small objects is a challenging task in computer vision. Conventional object detection methods have difficulty in finding the balance between high detection and low false alarm rates. In the literature, some methods have addressed this issue by enhancing the feature map responses, but without guaranteeing robustness with respect to the number of false alarms induced by background elements. To tackle this problem, we introduce an $\textit{a contrario}$ decision criterion into the learning process to take into account the unexpectedness of small objects. This statistic criterion enhances the feature map responses while controlling the number of false alarms (NFA) and can be integrated into any semantic segmentation neural network. Our add-on NFA module not only allows us to obtain competitive results for small target and crack detection tasks respectively, but also leads to more robust and interpretable results.
Deep learning based methods for single-image super-resolution (SR) have drawn a lot of attention lately. In particular, various papers have shown that the learning stage can be performed on a single image, resulting in the so-called internal approaches. The SinGAN method is one of these contributions, where the distribution of image patches is learnt on the image at hand and propagated at finer scales. Now, there are situations where some statistical a priori can be assumed for the final image. In particular, many natural phenomena yield images having power law Fourier spectrum, such as clouds and other texture images. In this work, we show how such a priori information can be integrated into an internal super-resolution approach, by constraining the learned up-sampling procedure of SinGAN. We consider various types of constraints, related to the Fourier power spectrum, the color histograms and the consistency of the upsampling scheme. We demonstrate on various experiments that these constraints are indeed satisfied, but also that some perceptual quality measures can be improved by the proposed approach.
We propose an auto-encoder architecture for multi-texture synthesis. The approach relies on both a compact encoder accounting for second order neural statistics and a generator incorporating adaptive periodic content. Images are embedded in a compact and geometrically consistent latent space, where the texture representation and its spatial organisation are disentangled. Texture synthesis and interpolation tasks can be performed directly from these latent codes. Our experiments demonstrate that our model outperforms state-of-the-art feed-forward methods in terms of visual quality and various texture related metrics.
Small target detection is an essential yet challenging task in defense applications, since differentiating low-contrast targets from natural textured and noisy environment remains difficult. To better take into account the contextual information, we propose to explore deep learning approaches based on attention mechanisms. Specifically, we propose a customized version of TransUnet including channel attention, which has shown a significant improvement in performance. Moreover, the lack of annotated data induces weak detection precision, leading to many false alarms. We thus explore a contrario methods in order to select meaningful potential targets detected by a weak deep learning training. -- La d\'etection de petites cibles est une probl\'ematique d\'elicate mais essentielle dans le domaine de la d\'efense, notamment lorsqu'il s'agit de diff\'erencier ces cibles d'un fond bruit\'e ou textur\'e, ou lorsqu'elles sont de faible contraste. Pour mieux prendre en compte les informations contextuelles, nous proposons d'explorer diff\'erentes approches de segmentation par apprentissage profond, dont certaines bas\'ees sur les m\'ecanismes d'attention. Nous proposons \'egalement d'inclure un module d'attention par canal au TransUnet, r\'eseau \`a l'\'etat de l'art, ce qui permet d'am\'eliorer significativement les performances. Par ailleurs, le manque de donn\'ees annot\'ees induit une perte en pr\'ecision lors des d\'etections, conduisant \`a de nombreuses fausses alarmes non pertinentes. Nous explorons donc des m\'ethodes a contrario afin de s\'electionner les cibles les plus significatives d\'etect\'ees par un r\'eseau entra\^in\'e avec peu de donn\'ees.