Michael Moeller

SIGMA: Scale-Invariant Global Sparse Shape Matching

Aug 16, 2023
Maolin Gao, Paul Roetzer, Marvin Eisenberger, Zorah Lähner, Michael Moeller, Daniel Cremers, Florian Bernard

We propose a novel mixed-integer programming (MIP) formulation for generating precise sparse correspondences for highly non-rigid shapes. To this end, we introduce a projected Laplace-Beltrami operator (PLBO) which combines intrinsic and extrinsic geometric information to measure the deformation quality induced by predicted correspondences. We integrate the PLBO, together with an orientation-aware regulariser, into a novel MIP formulation that can be solved to global optimality for many practical problems. In contrast to previous methods, our approach is provably invariant to rigid transformations and global scaling, requires no initialisation, has optimality guarantees, and scales to high-resolution meshes in (empirically observed) linear time. We show state-of-the-art results for sparse non-rigid matching on several challenging 3D datasets, including data with inconsistent meshing, as well as applications in mesh-to-point-cloud matching.
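
To make the MIP framing concrete, here is a minimal, hypothetical sketch: sparse correspondence selection between two toy landmark sets cast as a mixed-integer program and solved to global optimality with SciPy's `milp` (SciPy >= 1.9). The Euclidean cost is a placeholder, not the paper's PLBO energy.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))              # sparse landmarks on shape 1
Y = X + 0.01 * rng.normal(size=(5, 3))   # noisy copy standing in for shape 2

n = len(X)
# cost[i, j]: price of matching landmark i on X to landmark j on Y
cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1).ravel()

# Binary matching matrix P (flattened, row-major); each row and column
# sums to 1, so the global optimum is a permutation -- no initialisation.
A_rows = np.kron(np.eye(n), np.ones(n))  # row-sum constraints
A_cols = np.kron(np.ones(n), np.eye(n))  # column-sum constraints
constraints = LinearConstraint(np.vstack([A_rows, A_cols]), 1, 1)

res = milp(c=cost, integrality=np.ones(n * n), constraints=constraints)
P = res.x.reshape(n, n).round().astype(int)
print("matching:", P.argmax(axis=1))     # expect the identity for this toy
```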

* 14 pages 

An Evaluation of Zero-Cost Proxies -- from Neural Architecture Performance to Model Robustness

Jul 18, 2023
Jovita Lukasik, Michael Moeller, Margret Keuper

Zero-cost proxies are frequently studied and used to search for neural architectures. They show an impressive ability to predict the performance of architectures from their untrained weights, which allows for immense search speed-ups. So far, the joint search for well-performing and robust architectures has received much less attention in the field of NAS. Consequently, the main focus of zero-cost proxies is the clean accuracy of architectures, whereas model robustness should play an equally important role. In this paper, we analyze the ability of common zero-cost proxies to serve as performance predictors for robustness in the popular NAS-Bench-201 search space. We consider both the single prediction task for robustness and the joint multi-objective of clean and robust accuracy. We further analyze the feature importance of the proxies and show that predicting robustness from existing zero-cost proxies is a more challenging task. As a result, several proxies must be considered jointly to predict a model's robustness, while clean accuracy can be regressed from a single such feature.
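
A sketch of this evaluation protocol in miniature, on synthetic stand-in data (real experiments use NAS-Bench-201 architectures and actual proxy scores such as synflow or snip): regress clean and robust accuracy from proxy features and compare feature importances. The synthetic targets below are constructed to mirror the paper's finding, not to provide evidence for it.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_archs = 500
proxies = ["synflow", "jacov", "snip", "grad_norm"]
X = rng.normal(size=(n_archs, len(proxies)))           # stand-in proxy scores
clean_acc = X[:, 0] + 0.1 * rng.normal(size=n_archs)   # one proxy suffices
robust_acc = X[:, 1] - 0.5 * X[:, 2] + 0.3 * X[:, 3]   # needs several proxies

for name, y in [("clean", clean_acc), ("robust", robust_acc)]:
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
    model = RandomForestRegressor(random_state=0).fit(Xtr, ytr)
    importance = dict(zip(proxies, model.feature_importances_.round(2)))
    print(f"{name}: R^2={model.score(Xte, yte):.2f}, importance={importance}")
```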

* Accepted at DAGM GCPR 2023 

Differentiable Sensor Layouts for End-to-End Learning of Task-Specific Camera Parameters

Apr 28, 2023
Hendrik Sommerhoff, Shashank Agnihotri, Mohamed Saleh, Michael Moeller, Margret Keuper, Andreas Kolb

The success of deep learning is frequently described as the ability to train all parameters of a network on a specific application in an end-to-end fashion. Yet, several design choices on the camera level, including the pixel layout of the sensor, are considered pre-defined and fixed, and high-resolution, regular pixel layouts are considered the most generic ones in computer vision and graphics, treating all regions of an image as equally important. While several works have considered non-uniform, e.g., hexagonal or foveated, pixel layouts in hardware and image processing, the layout has so far not been integrated into the end-to-end learning paradigm. In this work, we present the first truly end-to-end trained imaging pipeline that optimizes the size and distribution of pixels on the imaging sensor jointly with the parameters of a given neural network on a specific task. We derive an analytic, differentiable approach for the sensor layout parameterization that allows for task-specific, locally varying pixel resolutions. We present two pixel layout parameterization functions: rectangular and curvilinear grid shapes that retain a regular topology. We provide a drop-in module that approximates sensor simulation given existing high-resolution images to directly connect our method with existing deep learning models. We show that network predictions benefit from learnable pixel layouts for two different downstream tasks, classification and semantic segmentation.
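
A minimal differentiable-resampling sketch in PyTorch (the warp and its single parameter are hypothetical and far simpler than the paper's parameterizations): simulating the sensor by sampling a high-resolution image on a learnable curvilinear grid makes the layout parameter trainable end-to-end alongside the downstream network.

```python
import torch
import torch.nn.functional as F

class CurvilinearSensor(torch.nn.Module):
    """Resamples a high-res image on a learnable curvilinear pixel grid."""
    def __init__(self, out_size=32):
        super().__init__()
        self.alpha = torch.nn.Parameter(torch.zeros(1))  # foveation strength
        lin = torch.linspace(-1, 1, out_size)
        self.base_y, self.base_x = torch.meshgrid(lin, lin, indexing="ij")

    def forward(self, img):  # img: (B, C, H, W) high-resolution input
        # Pull sampling coordinates toward the center; alpha=0 is a
        # regular grid, alpha>0 concentrates pixels near the image center.
        warp = lambda t: t * (1 + self.alpha * (t.abs() - 1))
        grid = torch.stack([warp(self.base_x), warp(self.base_y)], dim=-1)
        grid = grid.unsqueeze(0).expand(img.shape[0], -1, -1, -1)
        return F.grid_sample(img, grid, align_corners=True)

sensor = CurvilinearSensor()
out = sensor(torch.rand(2, 3, 128, 128))
out.sum().backward()                   # the layout parameter gets a gradient
print(out.shape, sensor.alpha.grad)
```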


WEAR: A Multimodal Dataset for Wearable and Egocentric Video Activity Recognition

Apr 11, 2023
Marius Bock, Michael Moeller, Kristof Van Laerhoven, Hilde Kuehne

Though research has shown the complementarity of camera- and inertial-based data, datasets which offer both modalities remain scarce. In this paper we introduce WEAR, a multimodal benchmark dataset for both vision- and wearable-based Human Activity Recognition (HAR). The dataset comprises data from 18 participants performing a total of 18 different workout activities, with untrimmed inertial (acceleration) and camera (egocentric video) data recorded at 10 different outdoor locations. WEAR features a diverse set of activities which are low in inter-class similarity and, unlike previous egocentric datasets, are neither defined by human-object interactions nor originate from inherently distinct activity categories. The provided benchmark results reveal that single-modality architectures have different strengths and weaknesses in their prediction performance. Further, in light of the recent success of transformer-based video action detection models, we demonstrate their versatility by applying them in a plain fashion using vision, inertial and combined (vision + inertial) features as input. Results show that vision transformers are not only able to produce competitive results using only inertial data, but can also function as an architecture to fuse both modalities by means of simple concatenation, with the multimodal approach producing the highest average mAP, precision and close-to-best F1-scores. Up until now, vision-based transformers have been explored in neither inertial nor multimodal human activity recognition, making our approach the first to do so. The dataset and code to reproduce our experiments are publicly available via: mariusbock.github.io/wear
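
A minimal sketch of the fusion-by-concatenation idea mentioned above, with hypothetical feature dimensions (this is not the released WEAR code): per-clip vision and inertial features are concatenated and classified jointly.

```python
import torch
import torch.nn as nn

class ConcatFusionHead(nn.Module):
    """Fuses per-clip vision and inertial features by simple concatenation."""
    def __init__(self, vision_dim=2048, imu_dim=128, n_classes=18):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(vision_dim + imu_dim, 256), nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, vision_feat, imu_feat):
        fused = torch.cat([vision_feat, imu_feat], dim=-1)
        return self.classifier(fused)

head = ConcatFusionHead()
logits = head(torch.rand(4, 2048), torch.rand(4, 128))
print(logits.shape)  # (4, 18): one score per workout activity
```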

* 12 pages, 2 figures, 2 tables 

CCuantuMM: Cycle-Consistent Quantum-Hybrid Matching of Multiple Shapes

Mar 28, 2023
Harshil Bhatia, Edith Tretschk, Zorah Lähner, Marcel Seelbach Benkner, Michael Moeller, Christian Theobalt, Vladislav Golyanik

Jointly matching multiple, non-rigidly deformed 3D shapes is a challenging, $\mathcal{NP}$-hard problem. A perfect matching is necessarily cycle-consistent: following the pairwise point correspondences along a cycle of shapes must lead back to the starting vertex of the original shape. Unfortunately, existing quantum shape-matching methods do not support multiple shapes, let alone cycle consistency. This paper addresses these open challenges and introduces the first quantum-hybrid approach for 3D shape multi-matching, which is, in addition, cycle-consistent. Its iterative formulation is compatible with modern adiabatic quantum hardware and scales linearly with the total number of input shapes. Both these characteristics are achieved by reducing the $N$-shape case to a sequence of three-shape matchings, the derivation of which is our main technical contribution. Thanks to quantum annealing, high-quality solutions with low energy are retrieved for the intermediate $\mathcal{NP}$-hard objectives. On benchmark datasets, the proposed approach significantly outperforms extensions to multi-shape matching of a previous quantum-hybrid two-shape matching method and is on par with classical multi-matching methods.
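
The cycle-consistency condition is easy to state with permutation matrices; here is a toy check, independent of any quantum machinery: composing P(A→B) and P(B→C) must equal P(A→C), so the loop A→B→C→A returns every vertex to itself.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
P_ab = np.eye(n)[rng.permutation(n)]   # matching shape A -> shape B
P_bc = np.eye(n)[rng.permutation(n)]   # matching shape B -> shape C
P_ac = P_bc @ P_ab                     # consistent A -> C by composition

cycle = P_ac.T @ P_bc @ P_ab           # follow A -> B -> C -> A
print(np.allclose(cycle, np.eye(n)))   # True: the loop is the identity
```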

* Computer Vision and Pattern Recognition (CVPR) 2023; 22 pages, 24 figures and 5 tables; Project page: https://4dqv.mpi-inf.mpg.de/CCuantuMM/ 

Convergent Data-driven Regularizations for CT Reconstruction

Dec 14, 2022
Samira Kabri, Alexander Auras, Danilo Riccio, Hartmut Bauermeister, Martin Benning, Michael Moeller, Martin Burger

The reconstruction of images from their corresponding noisy Radon transform is a typical example of an ill-posed linear inverse problem, as arising in the application of computerized tomography (CT). As the (naïve) solution does not depend on the measured data continuously, regularization is needed to re-establish a continuous dependence. In this work, we investigate simple yet provably convergent approaches to learning linear regularization methods from data. More specifically, we analyze two approaches: one generic linear regularization that learns how to manipulate the singular values of the linear operator in an extension of [1], and one tailored approach in the Fourier domain that is specific to CT reconstruction. We prove that such approaches become convergent regularization methods, and that the reconstructions they provide are typically much smoother than the training data they were trained on. Finally, we compare the spectral and the Fourier-based approaches to CT reconstruction numerically, discuss their advantages and disadvantages, and investigate the effect of discretization errors at different resolutions.
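
A rough illustration of the spectral idea, under one assumption-laden reading of "manipulating the singular values" (the operator and data below are random stand-ins, not a Radon transform): fit one filter coefficient per singular direction by least squares on training pairs, which yields a Wiener-type shrinkage that damps noise-dominated directions.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 40
A = rng.normal(size=(m, m)) / np.sqrt(m)       # random stand-in operator
U, s, Vt = np.linalg.svd(A)

X = rng.normal(size=(m, 200))                  # training signals (columns)
Y = A @ X + 0.05 * rng.normal(size=(m, 200))   # noisy measurements

# One filter coefficient f_i per singular direction, fitted by least
# squares: f_i = argmin_f sum_j (f * <u_i, y_j> - <v_i, x_j>)^2.
cy, cx = U.T @ Y, Vt @ X
f = (cx * cy).sum(axis=1) / (cy ** 2).sum(axis=1)

x_rec = Vt.T @ (f * (U.T @ Y[:, 0]))           # regularized reconstruction
damping = f * s                                # 1.0 would be the naive inverse
print("damping on the 3 smallest singular directions:", damping[-3:].round(2))
```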


QuAnt: Quantum Annealing with Learnt Couplings

Oct 13, 2022
Marcel Seelbach Benkner, Maximilian Krahn, Edith Tretschk, Zorah Lähner, Michael Moeller, Vladislav Golyanik

Modern quantum annealers can find high-quality solutions to combinatorial optimisation objectives given as quadratic unconstrained binary optimisation (QUBO) problems. Unfortunately, obtaining suitable QUBO forms in computer vision remains challenging and currently requires problem-specific analytical derivations. Moreover, such explicit formulations impose tangible constraints on solution encodings. In stark contrast to prior work, this paper proposes to learn QUBO forms from data through gradient backpropagation instead of deriving them. As a result, the solution encodings can be chosen flexibly and compactly. Furthermore, our methodology is general and virtually independent of the specifics of the target problem type. We demonstrate the advantages of learnt QUBOs on the diverse problem types of graph matching, 2D point cloud alignment and 3D rotation estimation. Our results are competitive with the previous quantum state of the art while requiring far fewer logical and physical qubits, enabling our method to scale to larger problems. The code and the new dataset will be open-sourced.
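
To make the QUBO terminology concrete: a QUBO assigns energy $b^\top Q b$ to each binary vector $b$, and an annealer searches for low-energy $b$. The brute-force solver and hand-chosen Q below are illustrative stand-ins; the paper's point is that Q is learned via backpropagation rather than derived.

```python
import itertools
import numpy as np

def solve_qubo_bruteforce(Q):
    """Exhaustively minimize b^T Q b over b in {0,1}^n (small n only)."""
    n = Q.shape[0]
    best_b, best_e = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        b = np.array(bits)
        e = b @ Q @ b
        if e < best_e:
            best_b, best_e = b, e
    return best_b, best_e

Q = np.array([[-1.0, 2.0, 0.0],
              [ 0.0, -1.0, 2.0],
              [ 0.0,  0.0, -1.0]])   # couplings penalize adjacent 1s
b, e = solve_qubo_bruteforce(Q)
print("argmin:", b, "energy:", e)    # (1, 0, 1): non-adjacent 1s win
```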

* incl. appendix 

On Adversarial Robustness of Deep Image Deblurring

Oct 05, 2022
Kanchana Vaishnavi Gandikota, Paramanand Chandramouli, Michael Moeller

Recent approaches employ deep learning-based solutions for the recovery of a sharp image from its blurry observation. This paper introduces adversarial attacks against deep learning-based image deblurring methods and evaluates the robustness of these neural networks to untargeted and targeted attacks. We demonstrate that imperceptible distortions can significantly degrade the performance of state-of-the-art deblurring networks, even producing drastically different content in the output, indicating the strong need for adversarially robust training not only in classification but also in image recovery.
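
A hedged sketch of an untargeted attack in this spirit: an L∞-bounded sign-gradient ascent that maximizes the change in the restored output. The attack details and the placeholder `net` are generic stand-ins, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def untargeted_attack(net, blurry, eps=4/255, steps=10, step_size=1/255):
    """Maximize the change in the restored output within an L_inf ball."""
    clean_out = net(blurry).detach()
    delta = torch.zeros_like(blurry, requires_grad=True)
    for _ in range(steps):
        distortion = F.mse_loss(net(blurry + delta), clean_out)
        distortion.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()  # ascend on distortion
            delta.clamp_(-eps, eps)                 # stay imperceptible
            delta.grad.zero_()
    return (blurry + delta).clamp(0, 1).detach()

net = torch.nn.Conv2d(3, 3, 3, padding=1)  # stand-in for a deblurring net
blurry = torch.rand(1, 3, 32, 32)
adv = untargeted_attack(net, blurry)
print("max |perturbation|:", (adv - blurry).abs().max().item())  # <= eps
```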

* ICIP 2022 

A Simple Strategy to Provable Invariance via Orbit Mapping

Sep 24, 2022
Kanchana Vaishnavi Gandikota, Jonas Geiping, Zorah Lähner, Adam Czapliński, Michael Moeller

Many applications require robustness, or ideally invariance, of neural networks to certain transformations of input data. Most commonly, this requirement is addressed by training data augmentation, by using adversarial training, or by defining network architectures that include the desired invariance by design. In this work, we propose a method to make network architectures provably invariant with respect to group actions by choosing one element from a (possibly continuous) orbit based on a fixed criterion. In a nutshell, we intend to 'undo' any possible transformation before feeding the data into the actual network. Further, we empirically analyze the properties of different approaches which incorporate invariance via training or architecture, and demonstrate the advantages of our method in terms of robustness and computational efficiency. In particular, we investigate the robustness with respect to rotations of images (which holds up to discretization artifacts) as well as the provable orientation and scaling invariance of 3D point cloud classification.
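
A minimal sketch of the orbit-mapping idea for point clouds, assuming centering, scale normalization, and PCA alignment as the fixed canonicalization criterion (the paper's construction is more general): any transformed and rescaled copy of a cloud maps to the same canonical network input.

```python
import numpy as np

def canonicalize(points):
    """Map a point cloud to a canonical element of its similarity orbit."""
    centered = points - points.mean(axis=0)   # undo translation
    centered /= np.linalg.norm(centered)      # undo global scaling
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    aligned = centered @ Vt.T                 # align to principal axes
    # Resolve SVD's per-axis sign ambiguity with a fixed, deterministic rule.
    signs = np.sign(aligned[np.abs(aligned).argmax(axis=0), range(3)])
    return aligned * signs

rng = np.random.default_rng(5)
pts = rng.normal(size=(100, 3))
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal transform
same = np.allclose(canonicalize(pts), canonicalize(3.7 * pts @ R.T))
print(same)  # True: the network sees identical inputs for both copies
```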

* ACCV 2022; an older version is titled "Training or Architecture? How to Incorporate Invariance in Neural Networks" (arXiv:2106.10044) 

Intrinsic Neural Fields: Learning Functions on Manifolds

Mar 23, 2022
Lukas Koestler, Daniel Grittner, Michael Moeller, Daniel Cremers, Zorah Lähner

Neural fields have gained significant attention in the computer vision community due to their excellent performance in novel view synthesis, geometry reconstruction, and generative modeling. Some of their advantages are a sound theoretical foundation and an easy implementation in current deep learning frameworks. While neural fields have been applied to signals on manifolds, e.g., for texture reconstruction, their representation has been limited to extrinsically embedding the shape into Euclidean space. The extrinsic embedding ignores known intrinsic manifold properties and is inflexible with respect to transferring the learned function. To overcome these limitations, this work introduces intrinsic neural fields, a novel and versatile representation for neural fields on manifolds. Intrinsic neural fields combine the advantages of neural fields with the spectral properties of the Laplace-Beltrami operator. We show theoretically that intrinsic neural fields inherit many desirable properties of the extrinsic neural field framework but exhibit additional intrinsic qualities, such as isometry invariance. In experiments, we show that intrinsic neural fields can reconstruct high-fidelity textures from images with state-of-the-art quality and are robust to the discretization of the underlying manifold. We demonstrate the versatility of intrinsic neural fields by tackling various applications: texture transfer between deformed shapes and between different shapes, texture reconstruction from real-world images with view dependence, and discretization-agnostic learning on meshes and point clouds.
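
A minimal sketch of the representation as the abstract describes it, with a random stand-in for the Laplace-Beltrami eigenfunctions: the field is an MLP over per-point eigenfunction values rather than over extrinsic xyz coordinates. In practice, `phi` would come from an eigendecomposition of a discrete mesh Laplacian.

```python
import torch
import torch.nn as nn

class IntrinsicField(nn.Module):
    """MLP over LBO-eigenfunction features instead of xyz coordinates."""
    def __init__(self, n_eigenfunctions=64, out_dim=3):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_eigenfunctions, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, out_dim),
        )

    def forward(self, phi):   # phi: (n_points, k) eigenfunction values
        return self.mlp(phi)  # e.g., an RGB texture value per surface point

# phi would be precomputed from the mesh's Laplace-Beltrami operator
# (e.g., via a sparse eigensolver); a random stand-in suffices here.
phi = torch.rand(1000, 64)
rgb = IntrinsicField()(phi)
print(rgb.shape)  # (1000, 3)
```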
