Janis Keuper

IMLA, Offenburg University

Retail-786k: a Large-Scale Dataset for Visual Entity Matching

Sep 29, 2023
Bianca Lamm, Janis Keuper

Entity Matching (EM) defines the task of learning to group objects by transferring semantic concepts from example groups (=entities) to unseen data. Despite the general availability of image data in the context of many EM problems, most currently available EM algorithms rely solely on (textual) meta data. In this paper, we introduce the first publicly available large-scale dataset for "visual entity matching", based on a production-level use case in the retail domain. Using scanned advertisement leaflets, collected over several years from different European retailers, we provide a total of ~786k manually annotated, high-resolution product images containing ~18k different individual retail products which are grouped into ~3k entities. The annotation of these product entities is based on a price comparison task, where each entity forms an equivalence class of comparable products. In a first baseline evaluation, we show that the proposed "visual entity matching" constitutes a novel learning problem which cannot be solved sufficiently well by standard image-based classification and retrieval algorithms. Instead, novel approaches are needed that transfer example-based visual equivalence classes to new data. The aim of this paper is to provide a benchmark for such algorithms. Information about the dataset, evaluation code and download instructions are provided under https://www.retail-786k.org/.
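
The kind of retrieval baseline the abstract argues is insufficient can be sketched as a nearest-neighbor entity assignment: embed a query product image and give it the entity label of its closest training example. All names and the toy embeddings below are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np

def entity_retrieval_baseline(train_emb, train_entity, query_emb):
    """Assign each query to the entity of its nearest training embedding
    (cosine similarity) - a naive retrieval baseline of the kind the
    paper argues cannot solve visual entity matching on its own."""
    # L2-normalize so the dot product equals cosine similarity
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    query = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    sims = query @ train.T                 # (n_query, n_train)
    nearest = sims.argmax(axis=1)          # index of the best match
    return [train_entity[i] for i in nearest]

# Toy example: two entities that are well separated in embedding space
train_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
train_entity = ["entity_a", "entity_a", "entity_b"]
queries = np.array([[0.95, 0.05], [0.1, 0.9]])
print(entity_retrieval_baseline(train_emb, train_entity, queries))
# → ['entity_a', 'entity_b']
```

The dataset's point is precisely that real entities are *semantic* equivalence classes (comparable products), which such purely visual nearest-neighbor assignment fails to capture.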

Don't Look into the Sun: Adversarial Solarization Attacks on Image Classifiers

Aug 24, 2023
Paul Gavrikov, Janis Keuper

Assessing the robustness of deep neural networks against out-of-distribution inputs is crucial, especially in safety-critical domains like autonomous driving, but also in safety systems where malicious actors can digitally alter inputs to circumvent safety guards. However, designing effective out-of-distribution tests that encompass all possible scenarios while preserving accurate label information is a challenging task. Existing methodologies often entail a compromise between the variety and the constraint level of attacks, and sometimes sacrifice both. As a first step towards a more holistic robustness evaluation of image classification models, we introduce an attack method based on image solarization that is conceptually straightforward yet, independent of its intensity, avoids jeopardizing the global structure of natural images. Through comprehensive evaluations of multiple ImageNet models, we demonstrate the attack's capacity to degrade accuracy significantly, provided it is not integrated into the training augmentations. Interestingly, even then, no full immunity to accuracy deterioration is achieved. In other settings, the attack can often be simplified into a black-box attack with model-independent parameters. Defenses against other corruptions do not consistently extend to our specific attack. Project website: https://github.com/paulgavrikov/adversarial_solarization
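
Solarization itself is a simple pointwise operation: invert every pixel value at or above a threshold. A minimal sketch of the operation, plus a black-box grid search over the threshold, follows; the `loss_fn` interface and the threshold grid are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

def solarize(img, threshold):
    """Invert all pixel values at or above `threshold` (values in [0, 1]).
    The global image structure stays intact while the intensity
    distribution is pushed out of the training domain."""
    return np.where(img >= threshold, 1.0 - img, img)

def solarization_attack(img, loss_fn, thresholds=np.linspace(0.1, 0.9, 9)):
    """Hypothetical black-box attack: sweep a small grid of thresholds
    and return the solarized image that maximizes the classifier's loss.
    `loss_fn` maps an image to a scalar loss (an assumed interface)."""
    candidates = [solarize(img, t) for t in thresholds]
    losses = [loss_fn(c) for c in candidates]
    return candidates[int(np.argmax(losses))]

img = np.array([[0.2, 0.8], [0.5, 0.9]])
print(solarize(img, 0.5))  # pixels >= 0.5 are inverted
```

Because the sweep only needs loss values, no gradients or model internals are required, which matches the abstract's observation that the attack often reduces to a black-box procedure with model-independent parameters.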

On the Interplay of Convolutional Padding and Adversarial Robustness

Aug 12, 2023
Paul Gavrikov, Janis Keuper

It is common practice to apply padding prior to convolution operations to preserve the resolution of feature maps in Convolutional Neural Networks (CNNs). While many alternatives exist, this is often achieved by adding a border of zeros around the inputs. In this work, we show that adversarial attacks often result in perturbation anomalies at the image boundaries, which are the areas where padding is used. Consequently, we aim to provide an analysis of the interplay between padding and adversarial attacks and seek an answer to the question of how different padding modes (or their absence) affect adversarial robustness in various scenarios.
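
The boundary effect of the padding mode is easy to observe even in one dimension. This NumPy sketch (not the paper's code) shows how zero padding introduces an artificial border response that reflective padding avoids:

```python
import numpy as np

def conv1d_with_padding(x, kernel, mode):
    """Pad, then 'valid'-convolve, so the output keeps the input length.
    `mode` is any np.pad mode: 'constant' (zeros), 'reflect', 'wrap', ..."""
    pad = len(kernel) // 2
    xp = np.pad(x, pad, mode=mode)
    return np.convolve(xp, kernel, mode="valid")

# A constant signal: an averaging filter should leave it unchanged,
# but zero padding creates an artificial edge at the boundary.
x = np.ones(6)
box = np.ones(3) / 3
print(conv1d_with_padding(x, box, "constant"))  # damped at the borders
print(conv1d_with_padding(x, box, "reflect"))   # unchanged
```

It is exactly this artificial border signal that an adversary can exploit, which is why perturbation anomalies concentrate where padding acts.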

* Accepted as full paper at ICCV-W 2023 BRAVO 

Automating Wood Species Detection and Classification in Microscopic Images of Fibrous Materials with Deep Learning

Jul 24, 2023
Lars Nieradzik, Jördis Sieburg-Rockel, Stephanie Helmling, Janis Keuper, Thomas Weibel, Andrea Olbrich, Henrike Stephani

We have developed a methodology for the systematic generation of a large image dataset of macerated wood references, which we used to generate image data for nine hardwood genera. This dataset is the basis for an approach that automates, for the first time, the identification of hardwood species in microscopic images of fibrous materials via deep learning. Our methodology includes a flexible pipeline for the easy annotation of vessel elements. We compare the performance of different neural network architectures and hyperparameters. Our proposed method performs on par with human experts. In the future, this will improve controls on global wood fiber product flows to protect forests.

Detecting Images Generated by Deep Diffusion Models using their Local Intrinsic Dimensionality

Jul 20, 2023
Peter Lorenz, Ricard Durall, Janis Keuper

Diffusion models have recently been applied successfully to synthesize strikingly realistic images. This raises strong concerns about their potential for malicious purposes. In this paper, we propose using the lightweight multi Local Intrinsic Dimensionality (multiLID) method, which was originally developed in the context of adversarial example detection, for the automatic detection of synthetic images and the identification of the corresponding generator networks. In contrast to many existing detection approaches, which often only work for GAN-generated images, the proposed method provides close to perfect detection results in many realistic use cases. Extensive experiments on known and newly created datasets demonstrate that the proposed multiLID approach achieves superior performance in diffusion detection and model identification. Since the empirical evaluations of recent publications on the detection of generated images are often mainly focused on the "LSUN-Bedroom" dataset, we further establish a comprehensive benchmark for the detection of diffusion-generated images, including samples from several diffusion models with different image sizes.
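
multiLID builds on the classical maximum-likelihood LID estimator computed from a point's nearest-neighbor distances. The sketch below shows only that core estimator; multiLID additionally extracts it per network layer and feeds the resulting feature vector to a classifier, which is not reproduced here.

```python
import numpy as np

def lid_mle(distances):
    """Maximum-likelihood Local Intrinsic Dimensionality estimate from a
    point's distances to its k nearest neighbors:
        LID = -(1/k * sum_i log(r_i / r_k))^-1,
    where r_k is the distance to the k-th (farthest) neighbor."""
    r = np.sort(np.asarray(distances, dtype=float))
    return -1.0 / np.mean(np.log(r / r[-1]))

# Sanity check: neighbor distances drawn as r_i = (i/k)^(1/d) concentrate
# like points in a d-dimensional ball, so the estimate recovers ~d.
k, d = 1000, 4
r = (np.arange(1, k + 1) / k) ** (1.0 / d)
print(round(lid_mle(r), 1))  # prints 4.0
```

The intuition behind the detector is that generated images occupy lower-dimensional neighborhoods in the feature spaces of a network than natural images, so these per-layer LID estimates separate the two.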

As large as it gets: Learning infinitely large Filters via Neural Implicit Functions in the Fourier Domain

Jul 19, 2023
Julia Grabinski, Janis Keuper, Margret Keuper

Motivated by the recent trend towards the usage of larger receptive fields for more context-aware neural networks in vision applications, we aim to investigate how large these receptive fields really need to be. To facilitate such a study, several challenges need to be addressed, most importantly: (i) We need to provide an effective way for models to learn large filters (potentially as large as the input data) without increasing their memory consumption during training or inference, (ii) the study of filter sizes has to be decoupled from other effects such as the network width or number of learnable parameters, and (iii) the employed convolution operation should be a plug-and-play module that can replace any conventional convolution in a Convolutional Neural Network (CNN) and allow for an efficient implementation in current frameworks. To realize such models, we propose to learn not spatial but frequency representations of filter weights as neural implicit functions, such that even infinitely large filters can be parameterized by only a few learnable weights. The resulting neural implicit frequency CNNs are the first models to achieve results on par with the state-of-the-art on large image classification benchmarks while executing convolutions solely in the frequency domain, and they can be employed within any CNN architecture. They allow us to provide an extensive analysis of the learned receptive fields. Interestingly, our analysis shows that, although the proposed networks could learn very large convolution kernels, the learned filters practically translate into well-localized and relatively small convolution kernels in the spatial domain.
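
The core idea can be sketched in a few lines: a tiny MLP queried at every frequency coordinate yields the filter's Fourier transform, so the parameter count (the MLP weights) is independent of the filter's spatial extent, and convolution becomes elementwise multiplication in the frequency domain. The MLP shape and random weights below are placeholders for the learned implicit function, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def implicit_filter(shape, w1, b1, w2, b2):
    """Query a small MLP at each frequency coordinate (u, v) to produce a
    filter spectrum covering the full input, from a handful of weights."""
    h, w = shape
    u, v = np.meshgrid(np.fft.fftfreq(h), np.fft.fftfreq(w), indexing="ij")
    coords = np.stack([u.ravel(), v.ravel()], axis=1)   # (h*w, 2)
    hidden = np.tanh(coords @ w1 + b1)                  # (h*w, 8)
    return (hidden @ w2 + b2).reshape(h, w)             # filter spectrum

def freq_conv(x, freq_filter):
    """Convolution as elementwise multiplication in the frequency domain."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * freq_filter))

# Illustrative stand-ins for learned parameters: 2 -> 8 -> 1 MLP
w1, b1 = rng.normal(size=(2, 8)), rng.normal(size=8)
w2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)

x = rng.normal(size=(16, 16))
f = implicit_filter(x.shape, w1, b1, w2, b2)
y = freq_conv(x, f)
print(x.shape, f.shape, y.shape)  # the filter spans the full input
```

Note that the same MLP could be queried on a larger frequency grid to serve a larger input, which is what makes the parameterization effectively size-independent.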

Fix your downsampling ASAP! Be natively more robust via Aliasing and Spectral Artifact free Pooling

Jul 19, 2023
Julia Grabinski, Janis Keuper, Margret Keuper

Convolutional neural networks encode images through a sequence of convolutions, normalizations and non-linearities as well as downsampling operations into potentially strong semantic embeddings. Yet, previous work showed that even slight sampling errors, which lead to aliasing, directly contribute to the networks' lack of robustness. To address such issues and facilitate simpler and faster adversarial training, [12] recently proposed FLC pooling, a method for provably alias-free downsampling - in theory. In this work, we conduct a further analysis through the lens of signal processing and find that such current pooling methods, which address aliasing in the frequency domain, are still prone to spectral leakage artifacts. Hence, we propose aliasing and spectral artifact-free pooling, short ASAP. While only introducing a few modifications to FLC pooling, networks using ASAP as their downsampling method exhibit higher native robustness against common corruptions, a property that FLC pooling lacked. ASAP also increases native robustness against adversarial attacks on high and low resolution data while maintaining similar clean accuracy or even outperforming the baseline.
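
FLC pooling, which ASAP refines, can be sketched in a few lines: transform to the frequency domain, keep only the central low-frequency crop, and transform back at half resolution. The additional modifications ASAP introduces to suppress spectral leakage are not reproduced in this sketch.

```python
import numpy as np

def flc_pool(x):
    """Frequency Low-Cut (FLC) pooling: keep only the low-frequency half
    of the spectrum and transform back, giving alias-free 2x downsampling
    (in theory - spectral leakage remains, which motivates ASAP)."""
    h, w = x.shape
    spec = np.fft.fftshift(np.fft.fft2(x))              # DC in the center
    lo = spec[h // 4: 3 * h // 4, w // 4: 3 * w // 4]   # central low freqs
    # Divide by 4 to compensate for the smaller inverse transform
    return np.real(np.fft.ifft2(np.fft.ifftshift(lo))) / 4

x = np.ones((8, 8))
y = flc_pool(x)
print(y.shape)  # (4, 4) - a constant image stays constant
```

Unlike strided convolution or max pooling, no frequency above the new Nyquist limit survives the crop, which is what makes the downsampling alias-free by construction.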

On Invariance, Equivariance, Correlation and Convolution of Spherical Harmonic Representations for Scalar and Vectorial Data

Jul 06, 2023
Janis Keuper

The mathematical representation of data in the Spherical Harmonic (SH) domain has recently regained interest in the machine learning community. This technical report gives an in-depth introduction to the theoretical foundations and practical implementation of SH representations, summarizing work on rotation-invariant and equivariant features as well as convolutions and exact correlations of signals on spheres. These methods are then generalized from scalar SH representations to Vectorial Harmonics (VH), providing the same capabilities for 3d vector fields on spheres.
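
One of the central invariance results, that the per-degree power spectrum of SH coefficients is rotation invariant, can be illustrated for the simplest case of a rotation about the z-axis, where each coefficient only picks up a phase (a special case of the general Wigner-D transform). The coefficient values below are arbitrary toy data.

```python
import numpy as np

def sh_power_spectrum(coeffs):
    """Rotation-invariant power spectrum of SH coefficients.
    `coeffs` maps degree l to the vector (c_{l,-l}, ..., c_{l,l});
    p_l = sum_m |c_{l,m}|^2 is unchanged under any rotation."""
    return {l: float(np.sum(np.abs(c) ** 2)) for l, c in coeffs.items()}

def rotate_about_z(coeffs, alpha):
    """Under a rotation by alpha about the z-axis, each coefficient picks
    up the phase e^{-i m alpha} - a special case of the Wigner-D matrices."""
    out = {}
    for l, c in coeffs.items():
        m = np.arange(-l, l + 1)
        out[l] = c * np.exp(-1j * m * alpha)
    return out

coeffs = {0: np.array([1.0 + 0j]), 1: np.array([0.5, -1.0, 0.25 + 0.5j])}
rotated = rotate_about_z(coeffs, 0.7)
print(all(np.isclose(sh_power_spectrum(coeffs)[l],
                     sh_power_spectrum(rotated)[l]) for l in coeffs))
# prints True
```

For general rotations the coefficients of each degree mix through a unitary Wigner-D matrix, so the same argument (unitarity preserves the norm per degree) carries over.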

* 106 pages, tech report 