Max Welling

Convolutional Networks for Spherical Signals

Sep 15, 2017
Taco Cohen, Mario Geiger, Jonas Köhler, Max Welling

The success of convolutional networks in learning problems involving planar signals such as images is due to their ability to exploit the translation symmetry of the data distribution through weight sharing. Many areas of science and engineering deal with signals with other symmetries, such as rotation-invariant data on the sphere. Examples include climate and weather science, astrophysics, and chemistry. In this paper we present spherical convolutional networks. These networks use convolutions on the sphere and rotation group, which results in rotational weight sharing and rotation equivariance. Using a synthetic spherical MNIST dataset, we show that spherical convolutional networks are very effective at dealing with rotationally invariant classification problems.

The Variational Fair Autoencoder

Aug 10, 2017
Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel

We investigate the problem of learning representations that are invariant to certain nuisance or sensitive factors of variation in the data while retaining as much of the remaining information as possible. Our model is based on a variational autoencoding architecture with priors that encourage independence between sensitive and latent factors of variation. Any subsequent processing, such as classification, can then be performed on this purged latent representation. To remove any remaining dependencies we incorporate an additional penalty term based on the "Maximum Mean Discrepancy" (MMD) measure. We discuss how these architectures can be efficiently trained on data and show in experiments that this method is more effective than previous work in removing unwanted sources of variation while maintaining informative latent representations.

* Fixed typo in eq. 3 and 4
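
As a rough illustration of the MMD penalty mentioned in the abstract above, the sketch below computes a biased estimate of the squared MMD between two sets of samples. The Gaussian kernel, the `gamma` bandwidth, and all function names are assumptions made for this example; they are not taken from the paper, which may use a different estimator.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between sample sets xs and ys."""

    def mean_kernel(a_set, b_set):
        # Average kernel value over all pairs drawn from the two sets.
        total = sum(rbf_kernel(a, b, gamma) for a in a_set for b in b_set)
        return total / (len(a_set) * len(b_set))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical latent codes for two groups defined by a sensitive attribute.
group_a = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
group_b = [[1.0, 1.1], [0.9, 1.2], [1.1, 0.8]]
penalty = mmd_squared(group_a, group_b)
```

In a setting like the one described, a term of this kind would be added to the training objective so that latent codes from the two groups become harder to tell apart.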

Improving Variational Auto-Encoders using convex combination linear Inverse Autoregressive Flow

Jun 14, 2017
Jakub M. Tomczak, Max Welling

In this paper, we propose a new volume-preserving flow and show that it performs similarly to the linear general normalizing flow. The idea is to enrich a linear Inverse Autoregressive Flow by introducing multiple lower-triangular matrices with ones on the diagonal and combining them using a convex combination. In experiments on MNIST and histopathology data we show that the proposed approach outperforms other volume-preserving flows and is competitive with the current state-of-the-art linear normalizing flow.

* Published at Benelearn 2017 (Eindhoven, the Netherlands)
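
The idea in the abstract above can be sketched in a few lines: a convex combination of lower-triangular matrices with ones on the diagonal is again lower-triangular with ones on the diagonal, so its determinant is 1 and the transformation preserves volume. The softmax used to produce the convex weights, the example matrices, and all names below are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Turn arbitrary logits into non-negative weights that sum to one."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine(matrices, weights):
    """Convex combination of square matrices given as nested lists."""
    n = len(matrices[0])
    return [
        [sum(w * mat[i][j] for w, mat in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


def apply_flow(matrix, z):
    """Apply the linear flow z -> matrix @ z."""
    return [sum(row[j] * z[j] for j in range(len(z))) for row in matrix]


def triangular_det(matrix):
    """Determinant of a triangular matrix: the product of its diagonal."""
    det = 1.0
    for i in range(len(matrix)):
        det *= matrix[i][i]
    return det


# Two hypothetical unit lower-triangular matrices and equal mixing logits.
L1 = [[1.0, 0.0], [2.0, 1.0]]
L2 = [[1.0, 0.0], [-1.0, 1.0]]
weights = softmax([0.0, 0.0])
combined = combine([L1, L2], weights)
z_new = apply_flow(combined, [2.0, 4.0])
```

Because the log-determinant of the Jacobian is zero, no extra term would be needed in the variational objective for this step.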

Temporally Efficient Deep Learning with Spikes

Jun 13, 2017
Peter O'Connor, Efstratios Gavves, Max Welling

The vast majority of natural sensory data is temporally redundant. Video frames or audio samples that are sampled at nearby points in time tend to have similar values. Typically, deep learning algorithms take no advantage of this redundancy to reduce computation. This can be an obscene waste of energy. We present a variant on backpropagation for neural networks in which computation scales with the rate of change of the data, not the rate at which we process the data. We do this by having neurons communicate a combination of their state and their temporal change in state. Intriguingly, this simple communication rule gives rise to units that resemble biologically-inspired leaky integrate-and-fire neurons, and to a weight-update rule that is equivalent to a form of Spike-Timing Dependent Plasticity (STDP), a synaptic learning rule observed in the brain. We demonstrate that on MNIST and a temporal variant of MNIST, our algorithm performs about as well as a Multilayer Perceptron trained with backpropagation, despite only communicating discrete values between layers.

* 8 pages + references and appendix

Recurrent Inference Machines for Solving Inverse Problems

Jun 13, 2017
Patrick Putzky, Max Welling

Much of the recent research on solving iterative inference problems focuses on moving away from hand-chosen inference algorithms and towards learned inference. In the latter, the inference process is unrolled in time and interpreted as a recurrent neural network (RNN), which allows for joint learning of model and inference parameters with back-propagation through time. In this framework, the RNN architecture is directly derived from a hand-chosen inference algorithm, effectively limiting its capabilities. We propose a learning framework, called Recurrent Inference Machines (RIM), in which we turn algorithm construction the other way round: Given data and a task, train an RNN to learn an inference algorithm. Because RNNs are Turing complete [1, 2] they are capable of implementing any inference algorithm. The framework allows for an abstraction which removes the need for domain knowledge. We demonstrate in several image restoration experiments that this abstraction is effective, allowing us to achieve state-of-the-art performance on image denoising and super-resolution tasks and superior across-task generalization.

Multiplicative Normalizing Flows for Variational Bayesian Neural Networks

Jun 12, 2017
Christos Louizos, Max Welling

We reinterpret multiplicative noise in neural networks as auxiliary random variables that augment the approximate posterior in a variational setting for Bayesian neural networks. We show that through this interpretation it is both efficient and straightforward to improve the approximation by employing normalizing flows while still allowing for local reparametrizations and a tractable lower bound. In experiments we show that with this new approximation we can significantly improve upon classical mean field for Bayesian neural networks in both predictive accuracy and predictive uncertainty.

* Appearing at the International Conference on Machine Learning (ICML) 2017

A New Method to Visualize Deep Neural Networks

Jun 12, 2017
Luisa M. Zintgraf, Taco S. Cohen, Max Welling

We present a method for visualising the response of a deep neural network to a specific input. For image data, for instance, our method highlights areas that provide evidence for and against choosing a certain class. The method overcomes several shortcomings of previous methods and provides great additional insight into the decision making process of convolutional networks, which is important both to improve models and to accelerate the adoption of such methods in fields such as medicine. In experiments on ImageNet data, we illustrate how the method works and can be applied in different ways to understand deep neural nets.

* Please note that this version of the article is outdated. The new version (published at ICLR2017) includes additional experiments on MRI scans and can be found at arXiv:1702.04595

Soft Weight-Sharing for Neural Network Compression

May 09, 2017
Karen Ullrich, Edward Meeds, Max Welling

The success of deep learning in numerous application domains has created the desire to run and train deep models on mobile devices. This, however, conflicts with their computation-, memory- and energy-intensive nature, leading to a growing interest in compression. Recent work by Han et al. (2015a) proposes a pipeline that involves retraining, pruning and quantization of neural network weights, obtaining state-of-the-art compression rates. In this paper, we show that competitive compression rates can be achieved by using a version of soft weight-sharing (Nowlan & Hinton, 1992). Our method achieves both quantization and pruning in one simple (re-)training procedure. This point of view also exposes the relation between compression and the minimum description length (MDL) principle.

* ICLR2017
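
One way to picture soft weight-sharing is as a penalty that is small when weights sit close to the centres of a Gaussian mixture and large otherwise, which pulls weights into a few clusters (quantization) and, if one centre sits at zero, towards zero (pruning). The sketch below computes such a negative log mixture density. The component values, the zero-centred component, and the function names are assumptions for illustration, not the configuration used in the paper.

```python
import math


def gaussian_pdf(w, mean, std):
    """Density of a univariate Gaussian at w."""
    z = (w - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_penalty(weights, mixing, means, stds):
    """Negative log-density of the weights under a Gaussian mixture prior."""
    total = 0.0
    for w in weights:
        density = sum(
            pi * gaussian_pdf(w, m, s) for pi, m, s in zip(mixing, means, stds)
        )
        total -= math.log(density)
    return total


# Hypothetical mixture: a dominant component at zero plus two other centres.
mixing = [0.8, 0.1, 0.1]
means = [0.0, 0.5, -0.5]
stds = [0.05, 0.05, 0.05]

clustered = [0.0, 0.01, 0.5, -0.5]   # weights close to the mixture centres
scattered = [0.25, -0.25, 0.3, 0.1]  # weights between the centres
penalty_clustered = mixture_penalty(clustered, mixing, means, stds)
penalty_scattered = mixture_penalty(scattered, mixing, means, stds)
```

During retraining, a term like this would be minimised together with the task loss, typically while also adapting the mixture parameters.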

Semi-Supervised Classification with Graph Convolutional Networks

Feb 22, 2017
Thomas N. Kipf, Max Welling

We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.

* Published as a conference paper at ICLR 2017
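
A single graph convolutional layer of the kind described above is commonly written as H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), where A is the adjacency matrix, D the degree matrix of A + I, H the node features, and W a weight matrix. The sketch below implements that rule with dense nested lists purely for readability. The helper names and the toy graph are assumptions, and a dense implementation does not reflect the linear scaling in the number of edges claimed in the abstract, which relies on sparse operations.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops, then apply symmetric degree normalisation."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain dense matrix product for nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """One propagation step: normalise, aggregate neighbours, project, apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weights)
    return [[max(0.0, v) for v in row] for row in projected]


# Toy path graph 0 - 1 - 2 with one-hot node features and identity weights.
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = gcn_layer(adjacency, identity, identity)
```

Stacking two such layers lets each node's representation depend on its two-hop neighbourhood.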

Visualizing Deep Neural Network Decisions: Prediction Difference Analysis

Feb 15, 2017
Luisa M Zintgraf, Taco S Cohen, Tameem Adel, Max Welling

This article presents the prediction difference analysis method for visualizing the response of a deep neural network to a specific input. When classifying images, the method highlights areas in a given input image that provide evidence for or against a certain class. It overcomes several shortcomings of previous methods and provides great additional insight into the decision making process of classifiers. Making neural network decisions interpretable through visualization is important both to improve models and to accelerate the adoption of black-box classifiers in application areas such as medicine. We illustrate the method in experiments on natural images (ImageNet data), as well as medical images (MRI brain scans).

* ICLR2017
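
The core idea of a prediction difference can be sketched as follows: compare the model's output for the full input with its output when one feature is treated as unknown, here approximated by averaging predictions over replacement values for that feature. The difference is expressed in log-odds, so positive values indicate evidence for the class and negative values evidence against it. The toy model, the replacement values, and the function names are assumptions for illustration; the paper's sampling scheme for images is more elaborate than this.

```python
import math


def log_odds(p, eps=1e-6):
    """Base-2 log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log2(p / (1.0 - p))


def prediction_difference(model, x, index, replacement_values):
    """Evidence contributed by feature `index` of input list `x` for the model's class."""
    p_full = model(x)
    # Approximate marginalising the feature out by averaging over replacements.
    p_marginal = sum(
        model(x[:index] + [v] + x[index + 1:]) for v in replacement_values
    ) / len(replacement_values)
    return log_odds(p_full) - log_odds(p_marginal)


def toy_model(x):
    """Hypothetical classifier whose class probability depends only on x[0]."""
    return 1.0 / (1.0 + math.exp(-2.0 * x[0]))


sample = [1.5, 0.3]
replacements = [-1.0, 0.0, 1.0]
relevance_used = prediction_difference(toy_model, sample, 0, replacements)
relevance_ignored = prediction_difference(toy_model, sample, 1, replacements)
```

For images, the same computation would be repeated per pixel or patch to produce a relevance map over the input.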
