Jacek Tabor

Molecule Attention Transformer

Feb 19, 2020
Łukasz Maziarka, Tomasz Danel, Sławomir Mucha, Krzysztof Rataj, Jacek Tabor, Stanisław Jastrzębski

Designing a single neural network architecture that performs competitively across a range of molecule property prediction tasks remains largely an open challenge, and its solution may unlock a widespread use of deep learning in the drug discovery industry. To move towards this goal, we propose Molecule Attention Transformer (MAT). Our key innovation is to augment the attention mechanism in Transformer using inter-atomic distances and the molecular graph structure. Experiments show that MAT performs competitively on a diverse set of molecular prediction tasks. Most importantly, with a simple self-supervised pretraining, MAT requires tuning of only a few hyperparameter values to achieve state-of-the-art performance on downstream tasks. Finally, we show that attention weights learned by MAT are interpretable from the chemical point of view.
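
The idea of augmenting attention with inter-atomic distances and the molecular graph can be sketched as a weighted mix of three matrices. This is a minimal illustration, not the authors' implementation: the function names, the mixing weights `lam_att`, `lam_dist`, `lam_adj`, and the choice of a row-wise softmax over negative distances are all assumptions made for the example.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def molecule_attention(q, k, v, dist, adj,
                       lam_att=0.34, lam_dist=0.33, lam_adj=0.33):
    """Mix dot-product attention with distance and adjacency information.

    q, k, v: per-atom query/key/value vectors (lists of lists).
    dist:    matrix of inter-atomic distances.
    adj:     adjacency matrix of the molecular graph.
    The lam_* weights are hypothetical hyperparameters.
    """
    n, d = len(q), len(q[0])
    # Assumption: nearer atoms get larger weight via softmax(-distance).
    dist_w = [softmax([-x for x in row]) for row in dist]
    out = []
    for i in range(n):
        scores = [sum(q[i][t] * k[j][t] for t in range(d)) / math.sqrt(d)
                  for j in range(n)]
        att = softmax(scores)
        mixed = [lam_att * att[j] + lam_dist * dist_w[i][j] + lam_adj * adj[i][j]
                 for j in range(n)]
        out.append([sum(mixed[j] * v[j][c] for j in range(n))
                    for c in range(len(v[0]))])
    return out
```

Setting `lam_dist` and `lam_adj` to zero recovers plain scaled dot-product attention, which makes the contribution of each extra term easy to isolate.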

LocoGAN -- Locally Convolutional GAN

Feb 18, 2020
Łukasz Struski, Szymon Knop, Jacek Tabor, Wiktor Daniec, Przemysław Spurek

In this paper we construct a fully convolutional GAN model, LocoGAN, whose latent space is given by noise-like images of possibly different resolutions. The learning is local, i.e. we process not the whole noise-like image but sub-images of a fixed size. As a consequence, LocoGAN can produce images of arbitrary dimensions, e.g. for the LSUN bedroom data set. Another advantage of our approach comes from the use of position channels, which allows the generation of fully periodic images (e.g. cylindrical panoramic images) or almost periodic "infinitely long" images (e.g. wallpapers).

Hypernetwork approach to generating point clouds

Feb 10, 2020
Przemysław Spurek, Sebastian Winczowski, Jacek Tabor, Maciej Zamorski, Maciej Zięba, Tomasz Trzciński

In this work, we propose a novel method for generating 3D point clouds that leverages properties of hyper networks. Contrary to the existing methods that learn only the representation of a 3D object, our approach simultaneously finds a representation of the object and its 3D surface. The main idea of our HyperCloud method is to build a hyper network that returns weights of a particular neural network (target network) trained to map points from a uniform unit ball distribution into a 3D shape. As a consequence, a particular 3D shape can be generated using point-by-point sampling from the assumed prior distribution and transforming sampled points with the target network. Since the hyper network is based on an auto-encoder architecture trained to reconstruct realistic 3D shapes, the target network weights can be considered a parametrization of the surface of a 3D shape, and not the standard point cloud representation usually returned by competing approaches. The proposed architecture allows finding a mesh-based representation of 3D objects in a generative manner while providing point clouds on par in quality with the state-of-the-art methods.
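
The sampling step described above (draw points from a uniform unit ball, then transform them with a target network whose weights come from elsewhere) can be sketched as follows. This is an illustrative toy under stated assumptions: the one-hidden-layer ReLU target network, the flat weight layout, and the names `sample_unit_ball`, `target_network`, and `generate_cloud` are hypothetical, and the hyper network that would produce `flat_weights` is not modelled.

```python
import math
import random

def sample_unit_ball(rng):
    """Draw one point uniformly from the 3D unit ball."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in g))
        if norm > 0.0:
            break
    # The cube root of a uniform variate gives uniform density over volume.
    radius = rng.random() ** (1.0 / 3.0)
    return [radius * x / norm for x in g]

def num_target_weights(hidden):
    """Weights of a 3 -> hidden -> 3 network: W1, b1, W2, b2."""
    return 3 * hidden + hidden + hidden * 3 + 3

def target_network(point, flat_weights, hidden=4):
    """Apply a small ReLU network whose parameters are given as one flat vector.

    Assumption: in a HyperCloud-style setup this vector would be the output
    of the hyper network; here it is simply passed in.
    """
    if len(flat_weights) != num_target_weights(hidden):
        raise ValueError("unexpected number of target-network weights")
    it = iter(flat_weights)
    w1 = [[next(it) for _ in range(3)] for _ in range(hidden)]
    b1 = [next(it) for _ in range(hidden)]
    w2 = [[next(it) for _ in range(hidden)] for _ in range(3)]
    b2 = [next(it) for _ in range(3)]
    h = [max(0.0, sum(w * x for w, x in zip(row, point)) + b)
         for row, b in zip(w1, b1)]
    return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(w2, b2)]

def generate_cloud(flat_weights, n_points, hidden=4, seed=0):
    """Generate a point cloud by sampling the prior and mapping each point."""
    rng = random.Random(seed)
    return [target_network(sample_unit_ball(rng), flat_weights, hidden)
            for _ in range(n_points)]
```

Because the shape lives in the weights rather than in a fixed list of points, the same `flat_weights` can be sampled with any `n_points`.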

WICA: nonlinear weighted ICA

Jan 13, 2020
Andrzej Bedychaj, Przemysław Spurek, Aleksandra Nowak, Jacek Tabor

Independent Component Analysis (ICA) aims to find a coordinate system in which the components of the data are independent. In this paper we construct a new nonlinear ICA model, called WICA, which obtains better and more stable results than other algorithms. A crucial tool is given by a new efficient method of verifying nonlinear dependence with the use of computation of correlation coefficients for normally weighted data.
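
To illustrate why correlation coefficients of normally weighted data can reveal nonlinear dependence, the sketch below computes a weighted Pearson correlation under Gaussian weights. It is a toy illustration, not the WICA procedure: the helper names, the weight centre, and `sigma` are assumptions.

```python
import math

def gaussian_weights(points, center, sigma=1.0):
    """Weight each point by an isotropic Gaussian centred at `center`."""
    return [math.exp(-sum((p - c) ** 2 for p, c in zip(pt, center))
                     / (2.0 * sigma ** 2))
            for pt in points]

def weighted_correlation(x, y, w):
    """Pearson correlation of x and y under non-negative weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / total
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / total
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(vx * vy)

# y = x^2 on symmetric x is dependent on x but linearly uncorrelated.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 for xi in x]
plain = weighted_correlation(x, y, [1.0] * len(x))
weighted = weighted_correlation(x, y, gaussian_weights(list(zip(x, y)), (1.0, 1.0)))
```

Here `plain` vanishes while `weighted` does not, so an off-centre Gaussian weighting exposes a dependence that the ordinary correlation coefficient misses.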

Biologically-Inspired Spatial Neural Networks

Oct 07, 2019
Maciej Wołczyk, Jacek Tabor, Marek Śmieja, Szymon Maszke

We introduce bio-inspired artificial neural networks consisting of neurons that are additionally characterized by spatial positions. To simulate properties of biological systems we add the costs penalizing long connections and the proximity of neurons in a two-dimensional space. Our experiments show that in the case where the network performs two different tasks, the neurons naturally split into clusters, where each cluster is responsible for processing a different task. This behavior not only corresponds to the biological systems, but also allows for further insight into interpretability or continual learning.
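
The two spatial costs mentioned above can be sketched as simple penalty terms. The exact functional forms are not given in the abstract, so both are assumptions: connection length is weighted by the absolute connection strength, and proximity is penalised with an exponentially decaying repulsion. The function names and the `scale` parameter are hypothetical.

```python
import math

def transport_cost(weights, pos_in, pos_out):
    """Sum of |w_ij| times the length of the connection from neuron i to j.

    weights[i][j] connects input-layer neuron i (at pos_in[i]) to
    output-layer neuron j (at pos_out[j]); positions are 2D tuples.
    """
    cost = 0.0
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            cost += abs(w) * math.dist(pos_in[i], pos_out[j])
    return cost

def proximity_cost(positions, scale=1.0):
    """Assumed repulsion term: largest when two neurons share a location."""
    cost = 0.0
    for a in range(len(positions)):
        for b in range(a + 1, len(positions)):
            cost += math.exp(-math.dist(positions[a], positions[b]) / scale)
    return cost
```

Added to a task loss, the first term pulls strongly connected neurons together while the second keeps neurons within a layer from collapsing onto one point.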

Geometric Graph Convolutional Neural Networks

Sep 11, 2019
Przemysław Spurek, Tomasz Danel, Jacek Tabor, Marek Śmieja, Łukasz Struski, Agnieszka Słowik, Łukasz Maziarka

Graph Convolutional Networks (GCNs) have recently become the primary choice for learning from graph-structured data, superseding hash fingerprints in representing chemical compounds. However, GCNs lack the ability to take into account the ordering of node neighbors, even when there is a geometric interpretation of the graph vertices that provides an order based on their spatial positions. To remedy this issue, we propose Geometric Graph Convolutional Network (geo-GCN) which uses spatial features to efficiently learn from graphs that can be naturally located in space. Our contribution is threefold: we propose a GCN-inspired architecture which (i) leverages node positions, (ii) is a proper generalisation of both GCNs and Convolutional Neural Networks (CNNs), (iii) benefits from augmentation which further improves the performance and assures invariance with respect to the desired properties. Empirically, geo-GCN outperforms state-of-the-art graph-based methods on image classification and chemical tasks.

SeGMA: Semi-Supervised Gaussian Mixture Auto-Encoder

Jun 21, 2019
Marek Śmieja, Maciej Wołczyk, Jacek Tabor, Bernhard C. Geiger

We propose a semi-supervised generative model, SeGMA, which learns a joint probability distribution of data and their classes and which is implemented in a typical Wasserstein auto-encoder framework. We choose a mixture of Gaussians as a target distribution in latent space, which provides a natural splitting of data into clusters. To connect Gaussian components with correct classes, we use a small amount of labeled data and a Gaussian classifier induced by the target distribution. SeGMA is optimized efficiently due to the use of Cramer-Wold distance as a maximum mean discrepancy penalty, which yields a closed-form expression for a mixture of spherical Gaussian components and thus obviates the need for sampling. While SeGMA preserves all properties of its semi-supervised predecessors and achieves at least as good generative performance on standard benchmark data sets, it presents additional features: (a) interpolation between any pair of points in the latent space produces realistic-looking samples; (b) combining the interpolation property with disentangled class and style variables, SeGMA is able to perform a continuous style transfer from one class to another; (c) it is possible to change the intensity of class characteristics in a data point by moving the latent representation of the data point away from specific Gaussian components.
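
One natural reading of "a Gaussian classifier induced by the target distribution" is the Bayes posterior over mixture components. The sketch below assumes spherical components with a shared `sigma` and one component per class; the function name and parameters are hypothetical and not taken from the paper.

```python
import math

def gaussian_classifier(z, means, priors, sigma=1.0):
    """Posterior probability of each spherical Gaussian component given z.

    Assumption: component k has mean means[k], mixing weight priors[k],
    and covariance sigma^2 * I, so the normalising constants cancel.
    """
    logits = [math.log(p) - sum((a - b) ** 2 for a, b in zip(z, m))
              / (2.0 * sigma ** 2)
              for m, p in zip(means, priors)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

A labeled latent code can then be scored with the negative log of the posterior for its true class, which ties each component to a class.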

Fast and Stable Interval Bounds Propagation for Training Verifiably Robust Models

Jun 03, 2019
Paweł Morawiecki, Przemysław Spurek, Marek Śmieja, Jacek Tabor

We present an efficient technique for training classification networks that are verifiably robust against norm-bounded adversarial attacks. This framework builds upon the work of Gowal et al., who apply interval arithmetic to bound the activations at each layer and keep the prediction invariant to the input perturbation. While that method is faster than competing approaches, it requires careful tuning of hyper-parameters and a large number of epochs to converge. To speed up and stabilize training, we supplement the cost function with an additional term, which encourages the model to keep the interval bounds at hidden layers small. Experimental results demonstrate that we can achieve comparable (or even better) results using a smaller number of training iterations, in a more stable fashion. Moreover, the proposed model is less sensitive to the exact specification of the training process, which makes it easier for practitioners to use.
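
Interval bound propagation through one affine layer and a ReLU can be sketched in a few lines, along with a width term of the kind the abstract describes. The centre/radius form of the affine step is the standard interval-arithmetic rule; the particular `width_penalty` (sum of interval widths) is an assumption, since the abstract does not specify the exact additional term.

```python
def affine_bounds(lower, upper, weight, bias):
    """Propagate elementwise bounds [lower, upper] through x -> W x + b."""
    center = [(l + u) / 2.0 for l, u in zip(lower, upper)]
    radius = [(u - l) / 2.0 for l, u in zip(lower, upper)]
    new_lower, new_upper = [], []
    for row, b in zip(weight, bias):
        mid = sum(w * c for w, c in zip(row, center)) + b
        rad = sum(abs(w) * r for w, r in zip(row, radius))
        new_lower.append(mid - rad)
        new_upper.append(mid + rad)
    return new_lower, new_upper

def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]

def width_penalty(lower, upper):
    """Assumed regulariser: total width of the hidden-layer intervals."""
    return sum(u - l for l, u in zip(lower, upper))

# Input box [0, 1]^2 pushed through a single neuron computing x1 - x2.
lo, hi = affine_bounds([0.0, 0.0], [1.0, 1.0], [[1.0, -1.0]], [0.0])
lo_act, hi_act = relu_bounds(lo, hi)
penalty = width_penalty(lo_act, hi_act)
```

Every input inside the box is guaranteed to produce an activation within `[lo_act, hi_act]`, so shrinking `penalty` during training tightens what can be verified downstream.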

Independent Component Analysis based on multiple data-weighting

May 31, 2019
Andrzej Bedychaj, Przemysław Spurek, Łukasz Struski, Jacek Tabor

Independent Component Analysis (ICA), one of the basic tools in data analysis, aims to find a coordinate system in which the components of the data are independent. In this paper we present the Multiple-weighted Independent Component Analysis (MWeICA) algorithm, a new ICA method based on approximate diagonalization of weighted covariance matrices. Our idea rests on a theoretical result which says that linear independence of weighted data (for Gaussian weights) guarantees independence. Experiments show that MWeICA achieves better results than most state-of-the-art ICA methods, with similar computational time.

One-element Batch Training by Moving Window

May 31, 2019
Przemysław Spurek, Szymon Knop, Jacek Tabor, Igor Podolak, Bartosz Wójcik

Several deep models, especially generative ones, compare samples from two distributions in their cost functions (e.g. WAE-like auto-encoder models, set-processing deep networks, etc.). With these methods, one cannot train the model directly on small batches (in the extreme case, a single element), because samples must be compared. We propose a generic approach to training such models using one-element mini-batches. The idea is based on splitting the batch in latent space into two parts: previous (historical) elements, used for latent space distribution matching, and current ones, used both for latent distribution computation and the minimization process. Due to the smaller memory requirements, this allows training networks on higher-resolution images than in the classical approach.
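
The split between historical and current latent elements can be sketched with a fixed-size buffer. This shows only the bookkeeping, under stated assumptions: the class name `LatentWindow`, its `step` method, and the window size are hypothetical, and in a real framework the historical codes would be stored without gradient tracking.

```python
from collections import deque

class LatentWindow:
    """Moving window of previously seen latent codes."""

    def __init__(self, size):
        # Oldest codes are dropped automatically once `size` is reached.
        self.history = deque(maxlen=size)

    def step(self, current):
        """Return (historical, current) parts of the virtual batch.

        The historical part would feed only the distribution-matching term;
        the current element would also drive the minimization step.
        """
        historical = list(self.history)
        self.history.append(current)
        return historical, [current]

window = LatentWindow(size=3)
for code in [1.0, 2.0, 3.0, 4.0]:
    window.step(code)
historical, current = window.step(5.0)
```

Only one sample has to be pushed through the network per step, while the distribution-matching term still sees `len(historical) + 1` latent codes.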
