Robert Jenssen

Deep Divergence-Based Approach to Clustering

Feb 13, 2019
Michael Kampffmeyer, Sigurd Løkse, Filippo M. Bianchi, Lorenzo Livi, Arnt-Børre Salberg, Robert Jenssen

A promising direction in deep learning research consists in learning representations and simultaneously discovering cluster structure in unlabeled data by optimizing a discriminative loss function. As opposed to supervised deep learning, this line of research is in its infancy, and how to design and optimize suitable loss functions to train deep neural networks for clustering is still an open question. Our contribution to this emerging field is a new deep clustering network that leverages the discriminative power of information-theoretic divergence measures, which have been shown to be effective in traditional clustering. We propose a novel loss function that incorporates geometric regularization constraints, thus avoiding degenerate structures of the resulting clustering partition. Experiments on synthetic benchmarks and real datasets show that the proposed network achieves competitive performance with respect to other state-of-the-art methods, scales well to large datasets, and does not require pre-training steps.

Recurrent Deep Divergence-based Clustering for simultaneous feature learning and clustering of variable length time series

Nov 29, 2018
Daniel J. Trosten, Andreas S. Strauman, Michael Kampffmeyer, Robert Jenssen

The task of clustering unlabeled time series and sequences entails a particular set of challenges, namely to adequately model temporal relations and variable sequence lengths. If these challenges are not properly handled, the resulting clusters might be of suboptimal quality. As a key solution, we present a joint clustering and feature learning framework for time series based on deep learning. For a given set of time series, we train a recurrent network to represent, or embed, each time series in a vector space such that a divergence-based clustering loss function can discover the underlying cluster structure in an end-to-end manner. Unlike previous approaches, our model inherently handles multivariate time series of variable lengths and does not require specification of a distance measure in the input space. On a diverse set of benchmark datasets we illustrate that our proposed Recurrent Deep Divergence-based Clustering approach outperforms, or performs comparably to, previous approaches.

Reservoir computing approaches for representation and classification of multivariate time series

Nov 06, 2018
Filippo Maria Bianchi, Simone Scardapane, Sigurd Løkse, Robert Jenssen

Classification of multivariate time series (MTS) has been tackled with a large variety of methodologies and applied to a wide range of scenarios. Among the existing approaches, reservoir computing (RC) techniques, which implement a fixed and high-dimensional recurrent network to process sequential data, are computationally efficient tools to generate a vectorial, fixed-size representation of the MTS that can be further processed by standard classifiers. Despite their unrivaled training speed, MTS classifiers based on a standard RC architecture fail to achieve the same accuracy as other classifiers, such as those exploiting fully trainable recurrent networks. In this paper we introduce the reservoir model space, an RC approach to learn vectorial representations of MTS in an unsupervised fashion. Each MTS is encoded within the parameters of a linear model trained to predict a low-dimensional embedding of the reservoir dynamics. Our model space yields a powerful representation of the MTS and, thanks to an intermediate dimensionality reduction procedure, attains computational performance comparable to other RC methods. As a second contribution we propose a modular RC framework for MTS classification, with an associated open source Python library. By combining the different modules it is possible to seamlessly implement advanced RC architectures, including our proposed unsupervised representation, bidirectional reservoirs, and non-linear readouts, such as deep neural networks with both fixed and flexible activation functions. Results obtained on benchmark and real-world MTS datasets show that RC classifiers are dramatically faster and, when implemented using our proposed representation, also achieve superior classification accuracy.

Understanding Convolutional Neural Network Training with Information Theory

Oct 12, 2018
Shujian Yu, Kristoffer Wickstrøm, Robert Jenssen, Jose C. Principe

Using information theoretic concepts to understand and explore the inner organization of deep neural networks (DNNs) remains a big challenge. Recently, the concept of an information plane (coupled with the famed information bottleneck principle) began to shed light on the analysis of multilayer perceptrons (MLPs). We provided an in-depth insight into stacked autoencoders (SAEs) using a novel matrix-based Renyi's α-entropy functional, enabling for the first time the analysis of the dynamics of learning using information flow in a real-world scenario involving complex network architectures and large data. Despite the great potential of these past works, there are several open questions when it comes to applying information theoretic concepts to understand convolutional neural networks (CNNs). These include, for instance, the accurate estimation of information quantities among multiple variables, and the many different training methodologies. By extending the novel matrix-based Renyi's α-entropy functional to a multivariate scenario and introducing the partial information decomposition (PID) framework, this paper presents a systematic method to analyze CNN training using information theory. Our results validate two fundamental data processing inequalities in CNNs, and also reveal some fundamental issues embedded in the training phase of CNNs.

* substantial improvement over v1

Multivariate Extension of Matrix-based Renyi's α-order Entropy Functional

Aug 23, 2018
Shujian Yu, Luis Gonzalo Sanchez Giraldo, Robert Jenssen, Jose C. Principe

The matrix-based Renyi's α-order entropy functional was recently introduced using the normalized eigenspectrum of a Hermitian matrix of the projected data in the reproducing kernel Hilbert space (RKHS). However, the current theory in the matrix-based Renyi's α-order entropy functional only defines the entropy of a single variable or mutual information between two random variables. In information theory and machine learning communities, one is also frequently interested in multivariate information quantities, such as the multivariate joint entropy and different interactive quantities among multiple variables. In this paper, we first define the matrix-based Renyi's α-order joint entropy among multiple variables. We then show how this definition can ease the estimation of various information quantities that measure the interactions among multiple variables, such as interactive information and total correlation. We finally present an application to feature selection to show how our definition provides a simple yet powerful way to estimate a widely-acknowledged intractable quantity from data. A real example on hyperspectral image (HSI) band selection is also provided.

* 26 pages, 8 figures
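
As a rough illustration of the idea in the abstract above, the following sketch computes the entropy functional for the special case α = 2, where the sum of squared eigenvalues of a symmetric matrix equals the sum of its squared entries, so no eigendecomposition is needed. The Gaussian kernel, the bandwidth `sigma`, the function names, and the use of a trace-normalized Hadamard product for the joint entropy are assumptions made for this example and are not taken from the listing; the paper itself should be consulted for the general α-order definition.

```python
import math

def gaussian_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian kernel over scalar samples (kernel choice is an assumption)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs] for a in xs]

def normalize_trace(K):
    """Scale a Gram matrix so that its trace equals 1."""
    t = sum(K[i][i] for i in range(len(K)))
    return [[v / t for v in row] for row in K]

def renyi2_entropy(A):
    """Order-2 entropy in bits for a symmetric, trace-1 matrix A.

    For symmetric A, tr(A^2) equals the sum of squared entries,
    which in turn equals the sum of squared eigenvalues.
    """
    return -math.log2(sum(v * v for row in A for v in row))

def joint_renyi2_entropy(A, B):
    """Joint entropy from the trace-normalized Hadamard (element-wise) product (assumed construction)."""
    H = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return renyi2_entropy(normalize_trace(H))

# Hypothetical data: four well-separated samples and four identical samples.
A_spread = normalize_trace(gaussian_gram([0.0, 100.0, 200.0, 300.0]))
A_same = normalize_trace(gaussian_gram([0.0, 0.0, 0.0, 0.0]))
```

With these inputs, the four well-separated samples give log2(4) = 2 bits, the four identical samples give 0 bits, and the joint entropy of the two is 2 bits, which matches the intuition that a constant variable adds no information.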

The Deep Kernelized Autoencoder

Jul 23, 2018
Michael Kampffmeyer, Sigurd Løkse, Filippo M. Bianchi, Robert Jenssen, Lorenzo Livi

Autoencoders learn data representations (codes) in such a way that the input is reproduced at the output of the network. However, it is not always clear what kind of properties of the input data need to be captured by the codes. Kernel machines have experienced great success by operating via inner-products in a theoretically well-defined reproducing kernel Hilbert space, hence capturing topological properties of input data. In this paper, we enhance the autoencoder's ability to learn effective data representations by aligning inner products between codes with respect to a kernel matrix. By doing so, the proposed kernelized autoencoder allows learning similarity-preserving embeddings of input data, where the notion of similarity is explicitly controlled by the user and encoded in a positive semi-definite kernel matrix. Experiments are performed for evaluating both reconstruction and kernel alignment performance in classification tasks and visualization of high-dimensional data. Additionally, we show that our method is capable of emulating kernel principal component analysis on a denoising task, obtaining competitive results at a much lower computational cost.

* This work extends the preliminary (conference) version of this paper (arXiv:1702.02526), Applied Soft Computing, Elsevier, 2018
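
To make the notion of aligning code inner products with a kernel matrix more concrete, here is a minimal sketch of one possible alignment penalty: the Frobenius distance between the normalized matrix of code inner products and the normalized kernel matrix. The normalization, the function names, and the toy inputs are assumptions for illustration; the abstract does not state the exact form of the loss used by the authors.

```python
import math

def inner_products(codes):
    """Matrix of pairwise inner products between code vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in codes] for u in codes]

def frobenius(M):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in M for x in row))

def alignment_loss(codes, K):
    """Frobenius distance between normalized code inner products and a normalized kernel matrix.

    Assumes neither matrix is all zeros. This is an illustrative penalty,
    not necessarily the loss used in the paper.
    """
    C = inner_products(codes)
    nc, nk = frobenius(C), frobenius(K)
    return math.sqrt(sum((c / nc - k / nk) ** 2
                         for rc, rk in zip(C, K)
                         for c, k in zip(rc, rk)))

# Hypothetical toy example: two orthogonal codes.
codes = [[1.0, 0.0], [0.0, 1.0]]
K_matching = [[3.0, 0.0], [0.0, 3.0]]     # same structure up to scale
K_mismatched = [[1.0, 1.0], [1.0, 1.0]]   # treats both points as identical
```

Because both matrices are normalized, the penalty is zero for `K_matching` even though its scale differs from that of the code inner products, and strictly positive for `K_mismatched`. In a training setup such a term would typically be added to the reconstruction loss with a weighting factor.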

An overview and comparative analysis of Recurrent Neural Networks for Short Term Load Forecasting

Jul 20, 2018
Filippo Maria Bianchi, Enrico Maiorino, Michael C. Kampffmeyer, Antonello Rizzi, Robert Jenssen

The key component in forecasting demand and consumption of resources in a supply network is an accurate prediction of real-valued time series. Indeed, both service interruptions and resource waste can be reduced with the implementation of an effective forecasting system. Significant research has thus been devoted to the design and development of methodologies for short term load forecasting over the past decades. A class of mathematical models, called Recurrent Neural Networks, is nowadays gaining renewed interest among researchers, and these models are replacing many practical implementations of forecasting systems previously based on static methods. Despite the undeniable expressive power of these architectures, their recurrent nature complicates their understanding and poses challenges in the training procedures. Recently, new important families of recurrent architectures have emerged, and their applicability in the context of load forecasting has not yet been investigated completely. In this paper we perform a comparative study on the problem of Short-Term Load Forecasting, using different classes of state-of-the-art Recurrent Neural Networks. We test the reviewed models first on controlled synthetic tasks and then on different real datasets, covering important practical case studies. We provide a general overview of the most important architectures and we define guidelines for configuring the recurrent networks to predict real-valued time series.

* Springer Briefs in Computer Science (ISBN 978-3-319-70338-1), 2017

Uncertainty and Interpretability in Convolutional Neural Networks for Semantic Segmentation of Colorectal Polyps

Jul 16, 2018
Kristoffer Wickstrøm, Michael Kampffmeyer, Robert Jenssen

Convolutional Neural Networks (CNNs) are propelling advances in a range of different computer vision tasks such as object detection and object segmentation. Their success has motivated research in applications of such models for medical image analysis. If CNN-based models are to be helpful in a medical context, they need to be precise and interpretable, and the uncertainty in their predictions must be well understood. In this paper, we develop and evaluate recent advances in uncertainty estimation and model interpretability in the context of semantic segmentation of polyps from colonoscopy images. We evaluate and enhance several architectures of Fully Convolutional Networks (FCNs) for semantic segmentation of colorectal polyps and provide a comparison between these models. Our highest performing model achieves a 76.06% mean IoU accuracy on the EndoScene dataset, a considerable improvement over the previous state-of-the-art.

* To appear in IEEE MLSP 2018
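
For readers unfamiliar with the mean intersection-over-union metric quoted in the abstract above, the sketch below shows one common way to compute it from flattened label masks. The function name, the choice to skip classes absent from both masks, and the toy masks are assumptions for this example; the exact evaluation protocol used on EndoScene is not described in this listing.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for flattened label masks.

    Classes that appear in neither mask are skipped (one common convention,
    assumed here). Raises ValueError if no class appears at all.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in either mask")
    return sum(ious) / len(ious)

# Hypothetical 4-pixel masks: 0 = background, 1 = polyp.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case the background class has IoU 1/2 and the polyp class has IoU 2/3, so the mean is 7/12, roughly 0.583.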

Segment-Based Credit Scoring Using Latent Clusters in the Variational Autoencoder

Jun 07, 2018
Rogelio Andrade Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen

Identifying customer segments in retail banking portfolios with different risk profiles can improve the accuracy of credit scoring. The Variational Autoencoder (VAE) has shown promising results in different research domains, and the powerful information embedded in the latent space of the VAE has been documented. We use the VAE and show that, by transforming the input data into a meaningful representation, it is possible to steer configurations in the latent space of the VAE. Specifically, the Weight of Evidence (WoE) transformation encapsulates the propensity to fall into financial distress, and the latent space in the VAE preserves this characteristic in a well-defined clustering structure. These clusters have considerably different risk profiles and therefore are suitable not only for credit scoring but also for marketing and customer purposes. This new clustering methodology offers solutions to some of the challenges in the existing clustering algorithms: among others, it suggests the number of clusters, assigns cluster labels to new customers, enables cluster visualization, scales to large datasets, and captures non-linear relationships. Finally, for portfolios with a large number of customers in each cluster, developing one classifier model per cluster can improve the credit scoring assessment.
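
The Weight of Evidence transformation mentioned above can be sketched as follows. This is a minimal illustration that assumes the common definition WoE = ln(share of non-defaulters in a bin / share of defaulters in a bin); the sign convention, the binning, the function name, and the toy data are assumptions and may differ from what the paper uses.

```python
import math

def weight_of_evidence(bins, defaults):
    """Per-bin Weight of Evidence for a binned feature.

    bins:     bin label for each customer
    defaults: 1 if the customer defaulted, 0 otherwise
    Assumes every bin contains at least one defaulter and one non-defaulter,
    otherwise the logarithm is undefined.
    """
    total_bad = sum(defaults)
    total_good = len(defaults) - total_bad
    counts = {}
    for b, d in zip(bins, defaults):
        good, bad = counts.get(b, (0, 0))
        counts[b] = (good + (1 - d), bad + d)
    return {b: math.log((good / total_good) / (bad / total_bad))
            for b, (good, bad) in counts.items()}

# Hypothetical toy portfolio with one binned feature (e.g. income band).
bins = ["low", "low", "low", "high", "high", "high"]
defaults = [0, 0, 1, 0, 1, 1]
woe = weight_of_evidence(bins, defaults)
```

Under this convention a positive value marks a bin with proportionally more non-defaulters (here `low`, with ln 2) and a negative value marks a riskier bin (here `high`, with -ln 2). Each customer's raw feature value would then be replaced by the WoE of its bin before being fed to the model.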

Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders

May 09, 2018
Filippo Maria Bianchi, Lorenzo Livi, Karl Øyvind Mikalsen, Michael Kampffmeyer, Robert Jenssen

Learning compressed representations of multivariate time series (MTS) facilitates the analysis and processing of the data in the presence of noise, redundant information, and a large number of variables and time steps. However, classic dimensionality reduction approaches are not designed to process sequential data, especially in the presence of missing values. In this work, we propose a novel autoencoder architecture based on recurrent neural networks to generate compressed representations of MTS, which may contain missing values and have variable lengths. Our autoencoder learns fixed-length vectorial representations, whose pairwise similarities are aligned with a kernel function that operates in input space and handles missing values. This allows relationships to be preserved in the low-dimensional vector space even in the presence of missing values. To highlight the main features of the proposed autoencoder, we first investigate its performance in controlled experiments. Subsequently, we show how the learned representations can be exploited in several benchmark and real-world classification tasks on medical data. Finally, based on the proposed architecture, we conceive a framework for one-class classification and imputation of missing data in time series extracted from ECG signals.
