Richard Baraniuk

Interpretable Image Clustering via Diffeomorphism-Aware K-Means

Dec 16, 2020
Romain Cosentino, Randall Balestriero, Yanis Bahroun, Anirvan Sengupta, Richard Baraniuk, Behnaam Aazhang

We design an interpretable clustering algorithm aware of the nonlinear structure of image manifolds. Our approach leverages the interpretability of $K$-means applied in the image space while addressing its clustering performance issues. Specifically, we develop a measure of similarity between images and centroids that encompasses a general class of deformations: diffeomorphisms, rendering the clustering invariant to them. Our work leverages the Thin-Plate Spline interpolation technique to efficiently learn diffeomorphisms best characterizing the image manifolds. Extensive numerical simulations show that our approach competes with state-of-the-art methods on various datasets.
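
The idea of a deformation-invariant similarity between samples and centroids can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: it swaps the learned Thin-Plate Spline diffeomorphisms for a fixed, hypothetical family of cyclic shifts on 1-D signals, and all function names are illustrative. It only shows how minimizing the distance over a set of transformations makes $K$-means insensitive to them.

```python
import numpy as np

def shifted_sq_dist(x, c, s):
    """Squared Euclidean distance between centroid c and x cyclically shifted by s."""
    return float(np.sum((np.roll(x, s) - c) ** 2))

def invariant_sq_dist(x, c, shifts):
    """Smallest squared distance between c and any allowed shift of x."""
    return min(shifted_sq_dist(x, c, s) for s in shifts)

def best_shift(x, c, shifts):
    """Shift that best aligns x with centroid c."""
    return min(shifts, key=lambda s: shifted_sq_dist(x, c, s))

def invariant_kmeans(X, init_centroids, shifts, n_iter=10):
    """K-means on 1-D signals whose distance is minimized over cyclic shifts.

    Assumption: the transformation family is a fixed list of shifts,
    not a learned diffeomorphism as in the paper.
    """
    centroids = np.array(init_centroids, dtype=float)
    for _ in range(n_iter):
        # Assignment step: nearest centroid under the shift-invariant distance.
        labels = np.array([
            int(np.argmin([invariant_sq_dist(x, c, shifts) for c in centroids]))
            for x in X
        ])
        # Update step: align each member to its centroid, then average.
        for k in range(len(centroids)):
            members = [np.roll(x, best_shift(x, centroids[k], shifts))
                       for x, lab in zip(X, labels) if lab == k]
            if members:
                centroids[k] = np.mean(members, axis=0)
    return labels, centroids
```

Because the centroids are averages of aligned samples, they stay in the input space and can be inspected directly, which is the interpretability argument the abstract makes for clustering in image space.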

Scalable Neural Tangent Kernel of Recurrent Architectures

Dec 09, 2020
Sina Alemohammad, Randall Balestriero, Zichao Wang, Richard Baraniuk

Kernels derived from deep neural networks (DNNs) in the infinite-width limit provide not only high performance in a range of machine learning tasks but also new theoretical insights into DNN training dynamics and generalization. In this paper, we extend the family of kernels associated with recurrent neural networks (RNNs), which were previously derived only for simple RNNs, to more complex architectures, namely bidirectional RNNs and RNNs with average pooling. We also develop a fast GPU implementation to exploit their full practical potential. While RNNs are typically only applied to time-series data, we demonstrate that classifiers using RNN-based kernels outperform a range of baseline methods on 90 non-time-series datasets from the UCI data repository.

Provable Finite Data Generalization with Group Autoencoder

Sep 20, 2020
Romain Cosentino, Randall Balestriero, Richard Baraniuk, Behnaam Aazhang

Deep Autoencoders (AEs) provide a versatile framework to learn a compressed, interpretable, or structured representation of data. As such, AEs have been used extensively for denoising, compression, and data completion, as well as for pre-training of Deep Networks (DNs) for various tasks such as classification. By providing a careful analysis of current AEs from a spline perspective, we can interpret the input-output mapping, in turn allowing us to derive conditions for generalization and reconstruction guarantees. By assuming a Lie group structure on the data at hand, we are able to derive a novel regularization of AEs that, for the first time, ensures the generalization of AEs in the finite training set case. We validate our theoretical analysis by demonstrating how this regularization significantly increases the generalization of the AE on various datasets.

The Recurrent Neural Tangent Kernel

Jun 18, 2020
Sina Alemohammad, Zichao Wang, Randall Balestriero, Richard Baraniuk

The study of deep networks (DNs) in the infinite-width limit, via the so-called Neural Tangent Kernel (NTK) approach, has provided new insights into the dynamics of learning, generalization, and the impact of initialization. One key DN architecture remains to be kernelized, namely, the Recurrent Neural Network (RNN). In this paper we introduce and study the Recurrent Neural Tangent Kernel (RNTK), which offers new insights into the behavior of overparametrized RNNs, including how different time steps are weighted by the RNTK to form the output under different initialization parameters and nonlinearity choices, and how inputs of different lengths are treated. We demonstrate via a number of experiments that the RNTK offers significant performance gains over other kernels, including standard NTKs, across a range of different data sets. A unique benefit of the RNTK is that it is agnostic to the length of the input, in stark contrast to other kernels.
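
For readers unfamiliar with the NTK approach, the following minimal sketch computes the closed-form infinite-width NTK of a one-hidden-layer fully connected ReLU network. It is not the RNTK introduced in the paper. It assumes the parameterization f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x) with standard Gaussian weights, both layers trained, and nonzero inputs; the function name is illustrative.

```python
import numpy as np

def relu_ntk(X1, X2):
    """Infinite-width NTK of a one-hidden-layer ReLU network (both layers trained).

    Assumes every row of X1 and X2 is a nonzero input vector.
    """
    n1 = np.linalg.norm(X1, axis=1)
    n2 = np.linalg.norm(X2, axis=1)
    dot = X1 @ X2.T
    norms = np.outer(n1, n2)
    # Angle between each pair of inputs; clip guards against rounding error.
    cos = np.clip(dot / norms, -1.0, 1.0)
    theta = np.arccos(cos)
    # E[relu'(w.x) relu'(w.x')] for w ~ N(0, I): first-layer gradient term.
    k0 = (np.pi - theta) / (2 * np.pi)
    # E[relu(w.x) relu(w.x')] for w ~ N(0, I): output-layer gradient term.
    k1 = norms * (np.sin(theta) + (np.pi - theta) * cos) / (2 * np.pi)
    return dot * k0 + k1
```

Such a kernel matrix can be plugged into any kernel method, for example kernel ridge regression via `np.linalg.solve(relu_ntk(X, X) + lam * np.eye(len(X)), y)` with a hypothetical ridge parameter `lam`. Note that this kernel requires inputs of a fixed dimension, which is the limitation the abstract contrasts with the length-agnostic RNTK.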

VarFA: A Variational Factor Analysis Framework For Efficient Bayesian Learning Analytics

May 27, 2020
Zichao Wang, Yi Gu, Andrew Lan, Richard Baraniuk

We propose VarFA, a variational inference factor analysis framework that extends existing factor analysis models for educational data mining to efficiently output uncertainty estimates for the model's estimated factors. Such uncertainty information is useful, for example, in an adaptive testing scenario, where additional tests can be administered if the model is not sufficiently certain about a student's skill level. Traditional Bayesian inference methods that produce such uncertainty information are computationally expensive and do not scale to large data sets. VarFA utilizes variational inference, which makes it possible to efficiently perform Bayesian inference even on very large data sets. We use the sparse factor analysis model as a case study and demonstrate the efficacy of VarFA on both synthetic and real data sets. VarFA is also very general and can be applied to a wide array of factor analysis models.
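
To make the role of factor uncertainty in adaptive testing concrete, here is a small sketch under strong simplifying assumptions that are not taken from the paper: a linear-Gaussian factor model y = W θ + noise with a standard normal prior on the skill vector θ and known loadings W. In this conjugate case the exact posterior is Gaussian, so a Gaussian variational family would recover it exactly. The abstract does not specify VarFA's likelihood or inference details, and every name below is hypothetical.

```python
import numpy as np

def skill_posterior(W, y, noise_var=1.0):
    """Gaussian posterior over a student's skill vector theta.

    Assumed model (illustrative only): y = W @ theta + eps,
    theta ~ N(0, I), eps ~ N(0, noise_var * I), W known.
    """
    k = W.shape[1]
    precision = np.eye(k) + W.T @ W / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ W.T @ y / noise_var
    return mean, cov

def needs_more_questions(cov, max_std=0.4):
    """Adaptive-testing rule: keep testing while any skill is too uncertain."""
    return bool(np.any(np.sqrt(np.diag(cov)) > max_std))
```

With a single skill and unit loadings, two responses give a posterior variance of 1/3 and eight responses give 1/9, so the rule above would request further questions after two responses but stop after eight.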

Max-Affine Spline Insights into Deep Generative Networks

Feb 26, 2020
Randall Balestriero, Sebastien Paris, Richard Baraniuk

We connect a large class of Generative Deep Networks (GDNs) with spline operators in order to derive their properties, limitations, and new opportunities. By characterizing the latent space partition, dimension, and angularity of the generated manifold, we relate the manifold dimension and approximation error to the sample size. The manifold-per-region affine subspace defines a local coordinate basis; we provide necessary and sufficient conditions relating those basis vectors with disentanglement. We also derive the output probability density mapped onto the generated manifold in terms of the latent space density, which enables the computation of key statistics such as its Shannon entropy. This finding also enables the computation of the GDN likelihood, which provides a new mechanism for model comparison as well as a quality measure for (generated) samples under the learned distribution. We demonstrate how low entropy and/or multimodal distributions are not naturally modeled by GDNs and are a cause of training instabilities.
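
The per-region affine structure mentioned above can be checked numerically on a toy generator. The sketch below assumes a one-hidden-layer ReLU network with hypothetical weights `W1, b1, W2, b2`. Within the region of latent space that shares the activation pattern of `z`, the network equals the affine map `A @ z + b`, and the columns of `A` span the local affine subspace of the generated manifold.

```python
import numpy as np

def relu_generator(z, W1, b1, W2, b2):
    """Toy one-hidden-layer ReLU generator mapping latent z to an output."""
    return W2 @ np.maximum(W1 @ z + b1, 0.0) + b2

def local_affine_map(z, W1, b1, W2, b2):
    """Affine parameters (A, b) of the generator on the region containing z."""
    # Activation pattern: which hidden units are on. It identifies the region.
    q = (W1 @ z + b1 > 0).astype(float)
    A = W2 @ (q[:, None] * W1)   # W2 diag(q) W1
    b = W2 @ (q * b1) + b2       # W2 diag(q) b1 + b2
    return A, b, q
```

The rank of `A` bounds the local dimension of the generated manifold: it can exceed neither the latent dimension nor the number of active hidden units.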

Dual Dynamic Inference: Enabling More Efficient, Adaptive and Controllable Deep Inference

Jul 17, 2019
Yue Wang, Jianghao Shen, Ting-Kuei Hu, Pengfei Xu, Tan Nguyen, Richard Baraniuk, Zhangyang Wang, Yingyan Lin

State-of-the-art convolutional neural networks (CNNs) yield record-breaking predictive performance, yet at the cost of energy-hungry inference, which prohibits their wide deployment in resource-constrained Internet of Things (IoT) applications. We propose a dual dynamic inference (DDI) framework that highlights the following aspects: 1) we integrate both input-dependent and resource-dependent dynamic inference mechanisms under a unified framework in order to fit the varying IoT resource requirements in practice. DDI is able both to constantly suppress unnecessary costs for easy samples and to halt inference for all samples to meet enforced hard resource constraints; 2) we propose a flexible multi-grained learning-to-skip (MGL2S) approach for input-dependent inference, which allows simultaneous layer-wise and channel-wise skipping; 3) we extend DDI to complex CNN backbones such as DenseNet and show that DDI can be applied towards optimizing any specific resource goal, including inference latency or energy cost. Extensive experiments demonstrate the superior inference accuracy-resource trade-off achieved by DDI, as well as the flexibility to control such trade-offs compared to existing peer methods. Specifically, DDI can achieve up to 4 times computational savings with the same or even higher accuracy as compared to existing competitive baselines.
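
The interplay of the two mechanisms can be pictured with a toy control loop. This is a hypothetical sketch, not the DDI implementation: the gate functions, per-block costs, threshold, and residual update are all assumptions chosen for illustration, and only layer-wise skipping is shown, with no channel-wise skipping.

```python
def dynamic_inference(x, blocks, gates, costs, budget, threshold=0.5):
    """Run residual blocks with input-dependent skipping and a hard budget.

    blocks: callables computing a residual update from the current state.
    gates:  callables scoring how useful each block is for this input.
    costs:  per-block resource cost (for example energy or latency units).
    """
    spent = 0.0
    executed = []
    for i, (block, gate, cost) in enumerate(zip(blocks, gates, costs)):
        # Resource-dependent mechanism: halt once the budget would be exceeded.
        if spent + cost > budget:
            break
        # Input-dependent mechanism: skip blocks the gate deems unnecessary.
        if gate(x) < threshold:
            continue
        x = x + block(x)
        spent += cost
        executed.append(i)
    return x, spent, executed
```

In this toy version an easy input, for which most gates score low, consumes little of the budget, while the budget check caps the cost of every input regardless of its gate scores.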

A Hessian Based Complexity Measure for Deep Networks

May 30, 2019
Hamid Javadi, Randall Balestriero, Richard Baraniuk

Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the overparameterized regime, where the number of parameters is larger than the number of training data points. Consequently, understanding the generalization properties and role of (explicit or implicit) regularization in these networks is of great importance. Inspired by the seminal work of Donoho and Grimes in manifold learning, we develop a new measure for the complexity of the function generated by a deep network based on the integral of the norm of the tangent Hessian. This complexity measure can be used to quantify the irregularity of the function a deep network fits to training data or as a regularization penalty for deep network learning. Indeed, we show that the oft-used heuristic of data augmentation imposes an implicit Hessian regularization during learning. We demonstrate the utility of our new complexity measure through a range of learning experiments.
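
A rough numerical analogue of such a measure is sketched below, with two loudly stated departures from the abstract: it uses the full ambient Hessian rather than the tangent Hessian along the data manifold, and it replaces the integral with an average of Frobenius norms over sample points. The Hessian is estimated by central finite differences, and the function names are illustrative.

```python
import numpy as np

def hessian_fd(f, x, eps=1e-3):
    """Central finite-difference estimate of the Hessian of scalar f at x."""
    d = x.size
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d)
            ej = np.zeros(d)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return H

def hessian_complexity(f, X, eps=1e-3):
    """Average Frobenius norm of the Hessian of f over the sample points X.

    Assumption: ambient Hessian and a sample average, standing in for the
    integral of the tangent-Hessian norm described in the abstract.
    """
    return float(np.mean([np.linalg.norm(hessian_fd(f, x, eps)) for x in X]))
```

An affine function scores zero under this measure, while a function with more curvature around the data scores higher, which is the sense in which the quantity can serve as a regularization penalty.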

* 13 pages, 8 figures
