Daniel Cremers

Deep Combinatorial Aggregation

Oct 12, 2022

Yuesong Shen, Daniel Cremers

Abstract: Neural networks are known to produce poor uncertainty estimations, and a variety of approaches have been proposed to remedy this issue. This includes deep ensemble, a simple and effective method that achieves state-of-the-art results for uncertainty-aware learning tasks. In this work, we explore a combinatorial generalization of deep ensemble called deep combinatorial aggregation (DCA). DCA creates multiple instances of network components and aggregates their combinations to produce diversified model proposals and predictions. DCA components can be defined at different levels of granularity. We find that coarse-grain DCAs can outperform deep ensemble for uncertainty-aware learning in terms of both predictive performance and uncertainty estimation. For fine-grain DCAs, we find that an average parameterization approach named deep combinatorial weight averaging (DCWA) can improve the baseline training. It is on par with stochastic weight averaging (SWA) but does not require any custom training schedule or adaptation of BatchNorm layers. Furthermore, we propose a consistency enforcing loss that helps the training of DCWA and modelwise DCA. We experiment on in-domain, distributional shift, and out-of-distribution image classification tasks, and empirically confirm the effectiveness of DCWA and DCA approaches.
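The aggregation over component combinations described in the abstract can be illustrated with a toy sketch. It assumes a two-stage model with two instances per stage; the scalar functions standing in for network components, and the name `dca_predict`, are hypothetical and are not taken from the paper.

```python
from itertools import product


def dca_predict(layer1_instances, layer2_instances, x):
    """Average the predictions of every (layer1, layer2) combination."""
    preds = [f2(f1(x)) for f1, f2 in product(layer1_instances, layer2_instances)]
    return sum(preds) / len(preds), len(preds)


# Hypothetical toy components: two instances per stage (assumption).
layer1 = [lambda x: 2.0 * x, lambda x: 2.5 * x]
layer2 = [lambda h: h + 1.0, lambda h: h - 1.0]

# Two instances per stage yield 2 * 2 = 4 model proposals.
mean_pred, n_models = dca_predict(layer1, layer2, 1.0)
```

A plain deep ensemble of the same size would pair each first-stage instance with a single second-stage instance, giving 2 models here instead of 4.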

* NeurIPS 2022

What Makes Graph Neural Networks Miscalibrated?

Oct 12, 2022

Hans Hao-Hsun Hsu, Yuesong Shen, Christian Tomani, Daniel Cremers

Abstract: Given the importance of getting calibrated predictions and reliable uncertainty estimations, various post-hoc calibration methods have been developed for neural networks on standard multi-class classification tasks. However, these methods are not well suited for calibrating graph neural networks (GNNs), which present unique challenges such as accounting for the graph structure and the graph-induced correlations between the nodes. In this work, we conduct a systematic study on the calibration qualities of GNN node predictions. In particular, we identify five factors which influence the calibration of GNNs: general under-confident tendency, diversity of nodewise predictive distributions, distance to training nodes, relative confidence level, and neighborhood similarity. Furthermore, based on the insights from this study, we design a novel calibration method named Graph Attention Temperature Scaling (GATS), which is tailored for calibrating graph neural networks. GATS incorporates designs that address all the identified influential factors and produces nodewise temperature scaling using an attention-based architecture. GATS is accuracy-preserving, data-efficient, and expressive at the same time. Our experiments empirically verify the effectiveness of GATS, demonstrating that it can consistently achieve state-of-the-art calibration results on various graph datasets for different GNN backbones.
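Temperature scaling itself, the building block that GATS applies per node, can be sketched as below. The temperatures are hand-picked constants for illustration only; the attention-based architecture that predicts a temperature for each node is not reproduced, and the function name is an assumption.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed stably (temperature > 0)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]                       # hypothetical node logits
sharp = softmax_with_temperature(logits, 0.5)  # T < 1 raises confidence
soft = softmax_with_temperature(logits, 2.0)   # T > 1 lowers confidence
```

Dividing by a positive temperature does not change the ordering of the logits, so the predicted class stays the same. This is why a method of this kind can be accuracy-preserving while still changing the confidence.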

* Accepted to NeurIPS 2022

DirectTracker: 3D Multi-Object Tracking Using Direct Image Alignment and Photometric Bundle Adjustment

Sep 29, 2022

Mariia Gladkova, Nikita Korobov, Nikolaus Demmel, Aljoša Ošep, Laura Leal-Taixé, Daniel Cremers

Abstract: Direct methods have shown excellent performance in the applications of visual odometry and SLAM. In this work we propose to leverage their effectiveness for the task of 3D multi-object tracking. To this end, we propose DirectTracker, a framework that effectively combines direct image alignment for short-term tracking and sliding-window photometric bundle adjustment for 3D object detection. Object proposals are estimated based on the sparse sliding-window point cloud and further refined using an optimization-based cost function that carefully combines 3D and 2D cues to ensure consistency in image and world space. We propose to evaluate 3D tracking using the recently introduced higher-order tracking accuracy (HOTA) metric and the generalized intersection over union similarity measure to mitigate the limitations of the conventional use of intersection over union for the evaluation of vision-based trackers. We perform evaluation on the KITTI Tracking benchmark for the Car class and show competitive performance in tracking objects both in 2D and 3D.
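The generalized intersection over union measure mentioned above can be sketched for the simplest case. The sketch assumes axis-aligned 2D boxes given as `(x1, y1, x2, y2)` with positive area; the paper evaluates 3D tracking, which this simplification does not cover.

```python
def giou(a, b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (hull - union) / hull


same = giou((0, 0, 1, 1), (0, 0, 1, 1))
apart = giou((0, 0, 1, 1), (2, 0, 3, 1))
```

Plain IoU is zero for every pair of non-overlapping boxes, whereas the generalized measure becomes more negative as the boxes move apart, so it still distinguishes near misses from distant ones.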

* In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2022

E-NeRF: Neural Radiance Fields from a Moving Event Camera

Aug 24, 2022

Simon Klenk, Lukas Koestler, Davide Scaramuzza, Daniel Cremers

Abstract: Estimating neural radiance fields (NeRFs) from ideal images has been extensively studied in the computer vision community. Most approaches assume optimal illumination and slow camera motion. These assumptions are often violated in robotic applications, where images contain motion blur and the scene may not have suitable illumination. This can cause significant problems for downstream tasks such as navigation, inspection or visualization of the scene. To alleviate these problems we present E-NeRF, the first method that estimates a volumetric scene representation in the form of a NeRF from a fast-moving event camera. Our method can recover NeRFs during very fast motion and in high dynamic range conditions, where frame-based approaches fail. We show that rendering high-quality frames is possible by only providing an event stream as input. Furthermore, by combining events and frames, we can estimate NeRFs of higher quality than state-of-the-art approaches under severe motion blur. We also show that combining events and frames can overcome failure cases of NeRF estimation in scenarios where only a few input views are available, without requiring additional regularization.

Semantic Self-adaptation: Enhancing Generalization with a Single Sample

Aug 10, 2022

Sherwin Bahmani, Oliver Hahn, Eduard Zamfir, Nikita Araslanov, Daniel Cremers, Stefan Roth

Abstract: Despite years of research, out-of-domain generalization remains a critical weakness of deep networks for semantic segmentation. Previous studies relied on the assumption of a static model, i.e., once the training process is complete, model parameters remain fixed at test time. In this work, we challenge this premise with a self-adaptive approach for semantic segmentation that adjusts the inference process to each input sample. Self-adaptation operates on two levels. First, it employs a self-supervised loss that customizes the parameters of convolutional layers in the network to the input image. Second, in Batch Normalization layers, self-adaptation approximates the mean and the variance of the entire test data, which is assumed unavailable. It achieves this by interpolating between the training and the reference distribution derived from a single test sample. To empirically analyze our self-adaptive inference strategy, we develop and follow a rigorous evaluation protocol that addresses serious limitations of previous work. Our extensive analysis leads to a surprising conclusion: Using a standard training procedure, self-adaptation significantly outperforms strong baselines and sets new state-of-the-art accuracy on multi-domain benchmarks. Our study suggests that self-adaptive inference may complement the established practice of model regularization at training time for improving deep network generalization to out-of-domain data.
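The interpolation of Batch Normalization statistics described above might look like the following sketch. It assumes a linear blend with a single weight `alpha` applied per channel; the exact form used by the authors, the function name, and the example values are all assumptions.

```python
def interpolate_bn_stats(train_mean, train_var, sample_mean, sample_var, alpha):
    """Blend per-channel training statistics with single-sample statistics.

    alpha = 0 keeps the training statistics, alpha = 1 uses only the sample.
    """
    mean = [(1 - alpha) * t + alpha * s for t, s in zip(train_mean, sample_mean)]
    var = [(1 - alpha) * t + alpha * s for t, s in zip(train_var, sample_var)]
    return mean, var


# Hypothetical two-channel statistics (assumption, not from the paper).
mean, var = interpolate_bn_stats(
    train_mean=[0.0, 2.0], train_var=[1.0, 1.0],
    sample_mean=[1.0, 4.0], sample_var=[3.0, 1.0],
    alpha=0.5,
)
```

Statistics from one image are noisy, so keeping part of the training statistics acts as a prior while still shifting the normalization towards the test sample.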

* Code and models: https://github.com/visinf/self-adaptive

Efficient and Flexible Sublabel-Accurate Energy Minimization

Jun 20, 2022

Zhakshylyk Nurlanov, Daniel Cremers, Florian Bernard

Abstract: We address the problem of minimizing a class of energy functions consisting of data and smoothness terms that commonly occur in machine learning, computer vision, and pattern recognition. While discrete optimization methods are able to give theoretical optimality guarantees, they can only handle a finite number of labels and therefore suffer from label discretization bias. Existing continuous optimization methods can find sublabel-accurate solutions, but they are not efficient for large label spaces. In this work, we propose an efficient sublabel-accurate method that utilizes the best properties of both continuous and discrete models. We separate the problem into two sequential steps: (i) global discrete optimization for selecting the label range, and (ii) efficient continuous sublabel-accurate local refinement of a convex approximation of the energy function in the chosen range. Doing so allows us to achieve a boost in time and memory efficiency while practically keeping the accuracy at the same level as continuous convex relaxation methods, and in addition, providing theoretical optimality guarantees at the level of discrete methods. Finally, we show the flexibility of the proposed approach to general pairwise smoothness terms, so that it is applicable to a wide range of regularizations. Experiments on the illustrative example of the image denoising problem demonstrate the properties of the proposed method. The code reproducing experiments is available at https://github.com/nurlanov-zh/sublabel-accurate-alpha-expansion.

Biologically Inspired Neural Path Finding

Jun 13, 2022

Hang Li, Qadeer Khan, Volker Tresp, Daniel Cremers

Abstract: The human brain can be considered to be a graphical structure comprising tens of billions of biological neurons connected by synapses. It has the remarkable ability to automatically re-route information flow through alternate paths in case some neurons are damaged. Moreover, the brain is capable of retaining information and applying it to similar but completely unseen scenarios. In this paper, we take inspiration from these attributes of the brain to develop a computational framework to find the optimal low-cost path between a source node and a destination node in a generalized graph. We show that our framework is capable of handling unseen graphs at test time. Moreover, it can find alternate optimal paths when nodes are arbitrarily added or removed during inference, while maintaining a fixed prediction time. Code is available here: https://github.com/hangligit/pathfinding

CHALLENGER: Training with Attribution Maps

May 30, 2022

Christian Tomani, Daniel Cremers

Abstract: We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance. Regularization is key in deep learning, especially when training complex models on relatively small datasets. In order to understand inner workings of neural networks, attribution methods such as Layer-wise Relevance Propagation (LRP) have been extensively studied, particularly for interpreting the relevance of input features. We introduce Challenger, a module that leverages the explainable power of attribution maps in order to manipulate particularly relevant input patterns, thereby exposing and subsequently resolving regions of ambiguity towards separating classes on the ground-truth data manifold, an issue that arises particularly when training models on rather small datasets. Our Challenger module increases model performance through building more diverse filters within the network and can be applied to any input data domain. We demonstrate that our approach results in substantially better classification as well as calibration performance on datasets with only a few samples up to datasets with thousands of samples. In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.

* Technical report

VPAIR -- Aerial Visual Place Recognition and Localization in Large-scale Outdoor Environments

May 23, 2022

Michael Schleiss, Fahmi Rouatbi, Daniel Cremers

Abstract: Visual Place Recognition and Visual Localization are essential components in navigation and mapping for autonomous vehicles, especially in GNSS-denied navigation scenarios. Recent work has focused on ground or close-to-ground applications such as self-driving cars, indoor scenarios, and low-altitude drone flights. However, applications such as Urban Air Mobility require operations in large-scale outdoor environments at medium to high altitudes. We present a new dataset named VPAIR. The dataset was recorded on board a light aircraft flying at an altitude of more than 300 meters above ground, capturing images with a downward-facing camera. Each image is paired with a high-resolution reference render including dense depth information and 6-DoF reference poses. The dataset covers a trajectory more than one hundred kilometers long over various types of challenging landscapes, e.g. urban, farmland and forests. Experiments on this dataset illustrate the challenges introduced by the change in perspective to a bird's eye view, such as in-plane rotations.

* ICRA 2022 Aerial Robotics Workshop

A Unified Framework for Implicit Sinkhorn Differentiation

May 13, 2022

Marvin Eisenberger, Aysim Toker, Laura Leal-Taixé, Florian Bernard, Daniel Cremers

Abstract: The Sinkhorn operator has recently experienced a surge of popularity in computer vision and related fields. One major reason is its ease of integration into deep learning frameworks. To allow for efficient training of the respective neural networks, we propose an algorithm that obtains analytical gradients of a Sinkhorn layer via implicit differentiation. In comparison to prior work, our framework is based on the most general formulation of the Sinkhorn operator. It allows for any type of loss function, while both the target capacities and cost matrices are differentiated jointly. We further construct error bounds of the resulting algorithm for approximate inputs. Finally, we demonstrate that for a number of applications, simply replacing automatic differentiation with our algorithm directly improves the stability and accuracy of the obtained gradients. Moreover, we show that it is computationally more efficient, particularly when resources like GPU memory are scarce.
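For readers unfamiliar with the operator, the forward Sinkhorn iterations can be sketched as follows. This shows only the standard entropy-regularized scaling scheme, not the implicit differentiation that the paper contributes; the regularization strength `reg`, the iteration count, and the toy cost matrix are assumptions.

```python
import math


def sinkhorn(cost, a, b, reg=0.5, n_iters=200):
    """Transport plan whose row sums approach a and column sums approach b."""
    n, m = len(a), len(b)
    # Gibbs kernel of the cost matrix.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


cost = [[0.0, 1.0], [1.0, 0.0]]  # hypothetical 2x2 cost matrix
plan = sinkhorn(cost, a=[0.5, 0.5], b=[0.5, 0.5])
```

Backpropagating through this loop with automatic differentiation requires storing every intermediate `u` and `v`, so memory grows with the number of iterations. That cost is the motivation the abstract gives for computing gradients analytically instead.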

* To appear at CVPR 2022
