Stephan Mandt

Predictive Querying for Autoregressive Neural Sequence Models

Oct 13, 2022
Alex Boyd, Sam Showalter, Stephan Mandt, Padhraic Smyth

In reasoning about sequential events it is natural to pose probabilistic queries such as "when will event A occur next" or "what is the probability of A occurring before B", with applications in areas such as user modeling, medicine, and finance. However, with machine learning shifting towards neural autoregressive models such as RNNs and transformers, probabilistic querying has been largely restricted to simple cases such as next-event prediction. This is in part due to the fact that future querying involves marginalization over large path spaces, which is not straightforward to do efficiently in such models. In this paper we introduce a general typology for predictive queries in neural autoregressive sequence models and show that such queries can be systematically represented by sets of elementary building blocks. We leverage this typology to develop new query estimation methods based on beam search, importance sampling, and hybrids. Across four large-scale sequence datasets from different application domains, as well as for the GPT-2 language model, we demonstrate the ability to make query answering tractable for arbitrary queries in exponentially-large predictive path-spaces, and find clear differences in cost-accuracy tradeoffs between search and sampling methods.
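
As a rough illustration of why such queries require marginalizing over future paths, the sketch below answers "what is the probability of A occurring before B within a fixed horizon" in two ways: exact enumeration over continuing paths, and plain Monte Carlo sampling. Everything here is an assumption made for illustration: the toy three-event Markov chain stands in for a trained autoregressive model, and the function names are hypothetical. It is naive sampling, not the beam search, importance sampling, or hybrid estimators the paper develops.

```python
import random

# Hypothetical toy stand-in for a trained autoregressive model: a first-order
# Markov chain over three event types. A real RNN or transformer would
# condition on the full history rather than only the last event.
VOCAB = ["A", "B", "C"]
TRANSITIONS = {
    "A": [0.2, 0.3, 0.5],
    "B": [0.3, 0.2, 0.5],
    "C": [0.1, 0.2, 0.7],
}


def next_event_probs(history):
    """Return P(next event | history) as a list aligned with VOCAB."""
    return TRANSITIONS[history[-1]]


def exact_a_before_b(history, horizon):
    """Exact P(A occurs before B within `horizon` steps) by path enumeration.

    The recursion branches on every event that is neither A nor B, so its
    cost grows exponentially with the number of such events in general.
    """
    if horizon == 0:
        return 0.0
    probs = dict(zip(VOCAB, next_event_probs(history)))
    total = probs["A"]  # paths that hit A at this step
    for event in VOCAB:
        if event not in ("A", "B"):  # paths that have hit neither yet
            total += probs[event] * exact_a_before_b(history + [event], horizon - 1)
    return total


def sample_a_before_b(history, horizon, num_samples, seed=0):
    """Naive Monte Carlo estimate of the same query from sampled futures."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        path = list(history)
        for _ in range(horizon):
            event = rng.choices(VOCAB, weights=next_event_probs(path))[0]
            if event == "A":
                hits += 1
                break
            if event == "B":
                break
            path.append(event)
    return hits / num_samples


exact = exact_a_before_b(["C"], horizon=5)
estimate = sample_a_before_b(["C"], horizon=5, num_samples=20000)
print(f"exact={exact:.5f} sampled={estimate:.5f}")
```

With only one non-terminating event type the enumeration stays cheap; with a realistic vocabulary the number of paths explodes, which is the regime where sampling-based and search-based estimators become relevant.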

* Presented at the Conference on Neural Information Processing Systems (NeurIPS 2022)

Lossy Image Compression with Conditional Diffusion Models

Sep 14, 2022
Ruihan Yang, Stephan Mandt

Diffusion models are a new class of generative models that mark a milestone in high-quality image generation while relying on solid probabilistic principles. This makes them promising candidate models for neural image compression. This paper outlines an end-to-end optimized framework based on a conditional diffusion model for image compression. Besides latent variables inherent to the diffusion process, the model introduces an additional per-instance "content" latent variable to condition the denoising process. Upon decoding, the diffusion process conditionally generates/reconstructs an image using ancestral sampling. Our experiments show that this approach outperforms one of the best-performing conventional image codecs (BPG) and one neural codec on two compression benchmarks, where we focus on rate-perception tradeoffs. Qualitatively, our approach shows fewer decompression artifacts than the classical approach.

* Accepted at the ECCV 2022 Workshop on Uncertainty Quantification for Computer Vision

Raising the Bar in Graph-level Anomaly Detection

May 27, 2022
Chen Qiu, Marius Kloft, Stephan Mandt, Maja Rudolph

Graph-level anomaly detection has become a critical topic in diverse areas, such as financial fraud detection and detecting anomalous activities in social networks. While most research has focused on anomaly detection for visual data such as images, where high detection accuracies have been obtained, existing deep learning approaches for graphs currently show considerably worse performance. This paper raises the bar on graph-level anomaly detection, i.e., the task of detecting abnormal graphs in a set of graphs. By drawing on ideas from self-supervised learning and transformation learning, we present a new deep learning approach that significantly improves existing deep one-class approaches by fixing some of their known problems, including hypersphere collapse and performance flip. Experiments on nine real-world data sets involving nine techniques reveal that our method achieves an average performance improvement of 11.8% AUC compared to the best existing approach.

* To appear in IJCAI-ECAI 2022

Diffusion Probabilistic Modeling for Video Generation

Mar 31, 2022
Ruihan Yang, Prakhar Srivastava, Stephan Mandt

Denoising diffusion probabilistic models are a promising new class of generative models that are competitive with GANs on perceptual metrics. In this paper, we explore their potential for sequentially generating video. Inspired by recent advances in neural video compression, we use denoising diffusion models to stochastically generate a residual to a deterministic next-frame prediction. We compare this approach to two sequential VAE and two GAN baselines on four datasets, where we test the generated frames for perceptual quality and forecasting accuracy against ground truth frames. We find significant improvements in terms of perceptual quality on all data and improvements in terms of frame forecasting for complex high-resolution videos.

SC2: Supervised Compression for Split Computing

Mar 16, 2022
Yoshitomo Matsubara, Ruihan Yang, Marco Levorato, Stephan Mandt

Split computing distributes the execution of a neural network (e.g., for a classification task) between a mobile device and a more powerful edge server. A simple alternative to splitting the network is to carry out the supervised task purely on the edge server while compressing and transmitting the full data, and most approaches have barely outperformed this baseline. This paper proposes a new approach for discretizing and entropy-coding intermediate feature activations to efficiently transmit them from the mobile device to the edge server. We show that an efficient splittable network architecture results from a three-way tradeoff between (a) minimizing the computation on the mobile device, (b) minimizing the size of the data to be transmitted, and (c) maximizing the model's prediction performance. We propose an architecture based on this tradeoff and train the splittable network and entropy model in a knowledge distillation framework. In an extensive set of experiments involving three vision tasks, three datasets, nine baselines, and more than 180 trained models, we show that our approach improves supervised rate-distortion tradeoffs while maintaining a considerably smaller encoder size. We also release sc2bench, an installable Python package, to encourage and facilitate future studies on supervised compression for split computing (SC2).

* Preprint. Code and models are available at https://github.com/yoshitomo-matsubara/sc2-benchmark

Latent Outlier Exposure for Anomaly Detection with Contaminated Data

Feb 20, 2022
Chen Qiu, Aodong Li, Marius Kloft, Maja Rudolph, Stephan Mandt

Anomaly detection aims at identifying data points that show systematic deviations from the majority of data in an unlabeled dataset. A common assumption is that clean training data (free of anomalies) is available, which is often violated in practice. We propose a strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models. The idea is to jointly infer a binary label for each datum (normal vs. anomalous) while updating the model parameters. Inspired by outlier exposure (Hendrycks et al., 2018), which considers synthetically created, labeled anomalies, we use a combination of two losses that share parameters: one for the normal and one for the anomalous data. We then iteratively proceed with block coordinate updates on the parameters and the most likely (latent) labels. Our experiments with several backbone models on three image datasets, 30 tabular data sets, and a video anomaly detection benchmark showed consistent and significant improvements over the baselines.
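
The alternating scheme described above can be sketched on a toy one-dimensional problem. All specifics below are assumptions made for illustration and are not taken from the paper: the model is a single center parameter `mu`, the normal loss is the squared distance to `mu`, the anomaly loss is its inverse, and the label step flags a fixed, assumed contamination fraction of the data.

```python
def normal_loss(x, mu):
    """Loss for a point treated as normal: squared distance to the center."""
    return (x - mu) ** 2


def anomaly_loss(x, mu, eps=1e-6):
    """Loss for a point treated as anomalous: small when far from the center."""
    return 1.0 / ((x - mu) ** 2 + eps)


def infer_labels(data, mu, contamination):
    """Label step: flag the assumed fraction of points whose normal loss
    most exceeds their anomaly loss, with the parameter held fixed."""
    num_anomalies = round(contamination * len(data))
    scores = [normal_loss(x, mu) - anomaly_loss(x, mu) for x in data]
    ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:num_anomalies])
    return [i in flagged for i in range(len(data))]


def update_parameter(data, labels, mu, lr=0.05, eps=1e-6):
    """Parameter step: one gradient step on the summed loss, labels held fixed."""
    grad = 0.0
    for x, is_anomaly in zip(data, labels):
        diff = x - mu
        if is_anomaly:
            grad += 2.0 * diff / (diff ** 2 + eps) ** 2  # d/dmu of anomaly_loss
        else:
            grad += -2.0 * diff  # d/dmu of normal_loss
    return mu - lr * grad


def fit(data, contamination, steps=30):
    """Alternate label inference and parameter updates (block coordinate style)."""
    mu = sum(data) / len(data)  # contaminated initial estimate
    labels = [False] * len(data)
    for _ in range(steps):
        labels = infer_labels(data, mu, contamination)
        mu = update_parameter(data, labels, mu)
    return mu, labels


# Eight points near zero plus two unlabeled outliers.
data = [-0.5, -0.2, 0.0, 0.1, 0.3, 0.4, -0.1, 0.2, 8.0, 9.0]
mu, labels = fit(data, contamination=0.2)
print(mu, labels)
```

Both losses depend on the same parameter `mu`, which mirrors the shared-parameter idea in the abstract; in this toy setting the center starts out pulled toward the outliers and moves back toward the bulk of the data once the two distant points are labeled anomalous.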

Hybridizing Physical and Data-driven Prediction Methods for Physicochemical Properties

Feb 17, 2022
Fabian Jirasek, Robert Bamler, Stephan Mandt

We present a generic way to hybridize physical and data-driven methods for predicting physicochemical properties. The approach "distills" the physical method's predictions into a prior model and combines it with sparse experimental data using Bayesian inference. We apply the new approach to predict activity coefficients at infinite dilution and obtain significant improvements compared to the data-driven and physical baselines and established ensemble methods from the machine learning literature.

* Chemical Communications 56 12407, 2020
* Published version

An Introduction to Neural Data Compression

Feb 14, 2022
Yibo Yang, Stephan Mandt, Lucas Theis

Neural compression is the application of neural networks and other machine learning methods to data compression. While machine learning deals with many concepts closely related to compression, entering the field of neural compression can be difficult due to its reliance on information theory, perceptual metrics, and other knowledge specific to the field. This introduction hopes to fill in the necessary background by reviewing basic coding topics such as entropy coding and rate-distortion theory, related machine learning ideas such as bits-back coding and perceptual metrics, and providing a guide through the representative works in the literature so far.
