Bernt Schiele

Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators

Apr 16, 2021
David Stutz, Nandhini Chandramoorthy, Matthias Hein, Bernt Schiele

Deep neural network (DNN) accelerators have received considerable attention in recent years due to their potential to save energy compared to mainstream hardware. Low-voltage operation of DNN accelerators reduces energy consumption significantly further, but causes bit-level failures in the memory storing the quantized DNN weights. Furthermore, DNN accelerators have been shown to be vulnerable to adversarial attacks on voltage controllers or individual bits. In this paper, we show that a combination of robust fixed-point quantization, weight clipping, as well as random bit error training (RandBET) or adversarial bit error training (AdvBET) improves robustness against random or adversarial bit errors in quantized DNN weights significantly. This not only leads to high energy savings for low-voltage operation as well as low-precision quantization, but also improves the security of DNN accelerators. Our approach generalizes across operating voltages and accelerators, as demonstrated on bit errors from profiled SRAM arrays, and achieves robustness against both targeted and untargeted bit-level attacks. Without losing more than 0.8%/2% in test accuracy, we can reduce energy consumption on CIFAR10 by 20%/30% for 8/4-bit quantization using RandBET. Allowing up to 320 adversarial bit errors, AdvBET reduces test error from above 90% (chance level) to 26.22% on CIFAR10.

* arXiv admin note: substantial text overlap with arXiv:2006.13977
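
The random bit error setting described in this abstract can be illustrated with a minimal sketch. It assumes a simple unsigned fixed-point scheme over a clipped range [-w_max, w_max]; the function names, the exact quantization formula, and the per-bit error probability `p` are illustrative assumptions, not the authors' implementation.

```python
import random

def quantize(weights, w_max, bits=8):
    """Clip weights to [-w_max, w_max] and map them to integer codes in [0, 2**bits - 1].

    Assumed scheme for illustration only; the paper's quantizer may differ.
    """
    levels = 2 ** bits - 1
    codes = []
    for w in weights:
        w = max(-w_max, min(w_max, w))  # weight clipping
        codes.append(round((w + w_max) / (2 * w_max) * levels))
    return codes

def dequantize(codes, w_max, bits=8):
    """Map integer codes back to floating-point weights in [-w_max, w_max]."""
    levels = 2 ** bits - 1
    return [c / levels * 2 * w_max - w_max for c in codes]

def inject_random_bit_errors(codes, p, bits=8, rng=None):
    """Flip every bit of every code independently with probability p."""
    rng = rng or random.Random()
    noisy = []
    for c in codes:
        mask = 0
        for b in range(bits):
            if rng.random() < p:
                mask |= 1 << b
        noisy.append(c ^ mask)  # XOR flips exactly the selected bits
    return noisy
```

In a training loop along the lines of RandBET, one would presumably evaluate the loss on `dequantize(inject_random_bit_errors(quantize(w, w_max), p), w_max)`; how gradients are combined with the clean loss is not stated in the abstract and is left out here.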

Relating Adversarially Robust Generalization to Flat Minima

Apr 09, 2021
David Stutz, Matthias Hein, Bernt Schiele

Adversarial training (AT) has become the de-facto standard to obtain models robust against adversarial examples. However, AT exhibits severe robust overfitting: cross-entropy loss on adversarial examples, so-called robust loss, decreases continuously on training examples, while eventually increasing on test examples. In practice, this leads to poor robust generalization, i.e., adversarial robustness does not generalize well to new examples. In this paper, we study the relationship between robust generalization and flatness of the robust loss landscape in weight space, i.e., whether robust loss changes significantly when perturbing weights. To this end, we propose average- and worst-case metrics to measure flatness in the robust loss landscape and show a correlation between good robust generalization and flatness. For example, throughout training, flatness reduces significantly during overfitting such that early stopping effectively finds flatter minima in the robust loss landscape. Similarly, AT variants achieving higher adversarial robustness also correspond to flatter minima. This holds for many popular choices, e.g., AT-AWP, TRADES, MART, AT with self-supervision or additional unlabeled examples, as well as simple regularization techniques, e.g., AutoAugment, weight decay or label noise. For fair comparison across these approaches, our flatness measures are specifically designed to be scale-invariant and we conduct extensive experiments to validate our findings.
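
The idea of measuring how much a loss changes under weight perturbations can be sketched as follows. This is a rough illustration under stated assumptions: `loss_fn` stands in for the robust loss, perturbations are drawn uniformly and scaled by each weight's magnitude as a simple stand-in for scale invariance, and the worst case is approximated by the maximum over random samples. None of these choices are confirmed by the abstract.

```python
import random

def flatness(loss_fn, weights, xi=0.5, n_samples=100, seed=0):
    """Return (average, worst) loss increase under random relative weight perturbations.

    loss_fn:  callable mapping a list of weights to a scalar loss (hypothetical stand-in
              for the robust loss on adversarial examples).
    xi:       maximum relative perturbation size per weight (assumed parameter).
    """
    rng = random.Random(seed)
    base = loss_fn(weights)
    increases = []
    for _ in range(n_samples):
        # Perturb each weight by up to xi times its own magnitude.
        perturbed = [w + rng.uniform(-xi, xi) * abs(w) for w in weights]
        increases.append(loss_fn(perturbed) - base)
    return sum(increases) / n_samples, max(increases)
```

Smaller values indicate a flatter neighborhood around `weights`. A sharper toy loss such as `10 * sum((x - 1) ** 2 for x in w)` yields larger values at `w = [1.0, 1.0]` than a shallower one with coefficient `0.1`.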

Convolutional Dynamic Alignment Networks for Interpretable Classifications

Mar 31, 2021
Moritz Böhle, Mario Fritz, Bernt Schiele

We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA-Nets), which are performant classifiers with a high degree of inherent interpretability. Their core building blocks are Dynamic Alignment Units (DAUs), which linearly transform their input with weight vectors that dynamically align with task-relevant patterns. As a result, CoDA-Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions. Given the alignment of the DAUs, the resulting contribution maps align with discriminative input patterns. These model-inherent decompositions are of high visual quality and outperform existing attribution methods under quantitative metrics. Further, CoDA-Nets constitute performant classifiers, achieving results on par with ResNet and VGG models on, e.g., CIFAR-10 and TinyImagenet.

* Accepted at CVPR 2021

Deep Wiener Deconvolution: Wiener Meets Deep Learning for Image Deblurring

Mar 18, 2021
Jiangxin Dong, Stefan Roth, Bernt Schiele

We present a simple and effective approach for non-blind image deblurring, combining classical techniques and deep learning. In contrast to existing methods that deblur the image directly in the standard image space, we propose to perform an explicit deconvolution process in a feature space by integrating a classical Wiener deconvolution framework with learned deep features. A multi-scale feature refinement module then predicts the deblurred image from the deconvolved deep features, progressively recovering detail and small-scale structures. The proposed model is trained in an end-to-end manner and evaluated on scenarios with both simulated and real-world image blur. Our extensive experimental results show that the proposed deep Wiener deconvolution network facilitates deblurred results with visibly fewer artifacts. Moreover, our approach quantitatively outperforms state-of-the-art non-blind image deblurring methods by a wide margin.

* Accepted to NeurIPS 2020 as an oral presentation. Project page: https://gitlab.mpi-klsb.mpg.de/jdong/dwdn
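
For readers unfamiliar with the classical component, the following is a minimal sketch of plain Wiener deconvolution on a 1D signal with a known blur kernel (the non-blind setting). It operates on raw signal values rather than learned deep features, uses a naive DFT and circular convolution for self-containment, and treats the noise-to-signal ratio `nsr` as an assumed constant, so it illustrates only the classical filter and not the proposed network.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def circular_blur(signal, kernel):
    """Circular convolution of a signal with a (shorter) blur kernel."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i - j) % n] for j in range(len(kernel)))
            for i in range(n)]

def wiener_deconvolve(blurred, kernel, nsr=1e-3):
    """Classical Wiener filter: X = conj(K) * Y / (|K|^2 + nsr) in the frequency domain."""
    n = len(blurred)
    padded = list(kernel) + [0.0] * (n - len(kernel))  # zero-pad kernel to signal length
    K = dft(padded)
    Y = dft(blurred)
    X = [k.conjugate() * y / (abs(k) ** 2 + nsr) for k, y in zip(K, Y)]
    return [v.real for v in dft(X, inverse=True)]
```

With noise-free input and a very small `nsr`, `wiener_deconvolve(circular_blur(signal, kernel), kernel, nsr=1e-9)` recovers `signal` closely; larger `nsr` values trade sharpness for noise suppression.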

Adjoint Rigid Transform Network: Self-supervised Alignment of 3D Shapes

Feb 01, 2021
Keyang Zhou, Bharat Lal Bhatnagar, Bernt Schiele, Gerard Pons-Moll

Most learning methods for 3D data (point clouds, meshes) suffer significant performance drops when the data is not carefully aligned to a canonical orientation. Aligning real-world 3D data collected from different sources is non-trivial and requires manual intervention. In this paper, we propose the Adjoint Rigid Transform (ART) Network, a neural module which can be integrated with existing 3D networks to significantly boost their performance in tasks such as shape reconstruction, non-rigid registration, and latent disentanglement. ART learns to rotate input shapes to a canonical orientation, which is crucial for many tasks. ART achieves this by imposing a rotation equivariance constraint on input shapes. The remarkable result is that with only self-supervision, ART can discover a unique canonical orientation for both rigid and nonrigid objects, which leads to a notable boost in downstream task performance. We will release our code and pre-trained models for further research.

You Only Need Adversarial Supervision for Semantic Image Synthesis

Dec 08, 2020
Vadim Sushko, Edgar Schönfeld, Dan Zhang, Juergen Gall, Bernt Schiele, Anna Khoreva

Despite their recent successes, GAN models for semantic image synthesis still suffer from poor image quality when trained with only adversarial supervision. Historically, additionally employing the VGG-based perceptual loss has helped to overcome this issue, significantly improving the synthesis quality, but at the same time limiting the progress of GAN models for semantic image synthesis. In this work, we propose a novel, simplified GAN model, which needs only adversarial supervision to achieve high quality results. We re-design the discriminator as a semantic segmentation network, directly using the given semantic label maps as the ground truth for training. By providing stronger supervision to the discriminator as well as to the generator through spatially- and semantically-aware discriminator feedback, we are able to synthesize images of higher fidelity with better alignment to their input label maps, making the use of the perceptual loss superfluous. Moreover, we enable high-quality multi-modal image synthesis through global and local sampling of a 3D noise tensor injected into the generator, which allows complete or partial image change. We show that images synthesized by our model are more diverse and follow the color and texture distributions of real images more closely. We achieve an average improvement of 6 FID and 5 mIoU points over the state of the art across different datasets using only adversarial supervision.

PoseTrackReID: Dataset Description

Nov 12, 2020
Andreas Doering, Di Chen, Shanshan Zhang, Bernt Schiele, Juergen Gall

Current datasets for video-based person re-identification (re-ID) do not include structural knowledge in the form of human pose annotations for the persons of interest. Nonetheless, pose information is very helpful for disentangling useful feature information from background or occlusion noise. Real-world scenarios in particular, such as surveillance, contain many occlusions caused by crowds or obstacles. On the other hand, video-based person re-ID can benefit other tasks such as multi-person pose tracking in terms of robust feature matching. For that reason, we present PoseTrackReID, a large-scale dataset for multi-person pose tracking and video-based person re-ID. With PoseTrackReID, we want to bridge the gap between person re-ID and multi-person pose tracking. Additionally, this dataset provides a good benchmark for current state-of-the-art methods on multi-frame person re-ID.

Meta-Aggregating Networks for Class-Incremental Learning

Oct 10, 2020
Yaoyao Liu, Bernt Schiele, Qianru Sun

Class-Incremental Learning (CIL) aims to learn a classification model with the number of classes increasing phase-by-phase. The inherent problem in CIL is the stability-plasticity dilemma between the learning of old and new classes, i.e., high-plasticity models easily forget old classes, whereas high-stability models are weak at learning new classes. We alleviate this issue by proposing a novel network architecture called Meta-Aggregating Networks (MANets) in which we explicitly build two residual blocks at each residual level (taking ResNet as the baseline architecture): a stable block and a plastic block. We aggregate the output feature maps from these two blocks and then feed the results to the next-level blocks. We meta-learn the aggregating weights in order to dynamically optimize the balance between the two types of blocks, i.e., between stability and plasticity. We conduct extensive experiments on three CIL benchmarks: CIFAR-100, ImageNet-Subset, and ImageNet, and show that many existing CIL methods can be straightforwardly incorporated into the architecture of MANets to boost their performance.

* Code: https://github.com/yaoyao-liu/class-incremental-learning

Haar Wavelet based Block Autoregressive Flows for Trajectories

Sep 21, 2020
Apratim Bhattacharyya, Christoph-Nikolas Straehle, Mario Fritz, Bernt Schiele

Prediction of trajectories such as those of pedestrians is crucial to the performance of autonomous agents. While previous works have leveraged conditional generative models like GANs and VAEs for learning the likely future trajectories, accurately modeling the dependency structure of these multimodal distributions, particularly over long time horizons, remains challenging. Normalizing flow based generative models can model complex distributions admitting exact inference. These include variants with split coupling invertible transformations that are easier to parallelize compared to their autoregressive counterparts. To this end, we introduce a novel Haar wavelet based block autoregressive model leveraging split couplings, conditioned on coarse trajectories obtained from Haar wavelet based transformations at different levels of granularity. This yields an exact inference method that models trajectories at different spatio-temporal resolutions in a hierarchical manner. We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets: Stanford Drone and Intersection Drone.

* German Conference on Pattern Recognition, 2020 (oral)
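
The coarse-to-fine decomposition mentioned in this abstract can be illustrated with a minimal sketch of a Haar wavelet transform applied to a 2D trajectory. It uses the unnormalized averaging/differencing variant, and the function names are hypothetical; the flow model that is conditioned on the coarse trajectories is not shown.

```python
def haar_step(points):
    """Split a trajectory into a coarse trajectory (pairwise means) and detail coefficients."""
    if len(points) % 2 != 0:
        raise ValueError("trajectory length must be even")
    coarse, detail = [], []
    for i in range(0, len(points), 2):
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        coarse.append(((x0 + x1) / 2, (y0 + y1) / 2))
        detail.append(((x0 - x1) / 2, (y0 - y1) / 2))
    return coarse, detail

def haar_inverse(coarse, detail):
    """Exactly reconstruct the finer trajectory from coarse points and detail coefficients."""
    points = []
    for (cx, cy), (dx, dy) in zip(coarse, detail):
        points.append((cx + dx, cy + dy))
        points.append((cx - dx, cy - dy))
    return points

def haar_pyramid(points, levels):
    """Apply haar_step repeatedly; returns the coarsest trajectory and details per level."""
    details = []
    coarse = list(points)
    for _ in range(levels):
        coarse, d = haar_step(coarse)
        details.append(d)
    return coarse, details
```

Because each step is invertible, a model can first generate the coarsest trajectory and then refine it level by level via `haar_inverse`, which matches the hierarchical modeling the abstract describes at a high level.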

Synthetic Convolutional Features for Improved Semantic Segmentation

Sep 18, 2020
Yang He, Bernt Schiele, Mario Fritz

Recently, learning-based image synthesis has made it possible to generate high-resolution images, using either adversarial training or a perceptual loss. However, it remains challenging to successfully leverage additional synthetic images for improving semantic segmentation. Therefore, we suggest generating intermediate convolutional features and propose the first synthesis approach that is catered to such intermediate convolutional features. This allows us to generate new features from label masks and include them successfully into the training procedure in order to improve the performance of semantic segmentation. Experimental results and analysis on two challenging datasets, Cityscapes and ADE20K, show that our generated features improve performance on segmentation tasks.

* ECCV 2020 Workshop on Assistive Computer Vision and Robotics
