The Weyl transform is introduced as a rich framework for data representation. Transform coefficients are connected to the Walsh-Hadamard transform of multiscale autocorrelations, and different forms of dyadic periodicity in a signal are shown to appear as different features in its Weyl coefficients. The Weyl transform has a high degree of symmetry with respect to a large group of multiscale transformations, which allows compact yet discriminative representations to be obtained by pooling coefficients. The effectiveness of the Weyl transform is demonstrated through the example of textured image classification.
Subspace models play an important role in a wide range of signal processing tasks, and this paper explores how the pairwise geometry of subspaces influences the probability of misclassification. When the mismatch between the signal and the model is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. The transform presented here (TRAIT) preserves some specific characteristic of each individual class, and this approach is shown to be complementary to a previously developed transform (LRT) that enlarges inter-class distance while suppressing intra-class dispersion. Theoretical results are supported by demonstration of superior classification accuracy on synthetic and measured data even in the presence of significant model mismatch.
In this paper, computer-based techniques for stylistic analysis of paintings are applied to the five panels of the 14th century Peruzzi Altarpiece by Giotto di Bondone. Features are extracted by combining a dual-tree complex wavelet transform with a hidden Markov tree (HMT) model. Hierarchical clustering is used to identify stylistic keywords in image patches, and keyword frequencies are calculated for sub-images that each contains many patches. A generative hierarchical Bayesian model learns stylistic patterns of keywords; these patterns are then used to characterize the styles of the sub-images; this in turn, permits to discriminate between paintings. Results suggest that such unsupervised probabilistic topic models can be useful to distill characteristic elements of style.
We study the problem of supervised linear dimensionality reduction, taking an information-theoretic viewpoint. The linear projection matrix is designed by maximizing the mutual information between the projected signal and the class label (based on a Shannon entropy measure). By harnessing a recent theoretical result on the gradient of mutual information, the above optimization problem can be solved directly using gradient descent, without requiring simplification of the objective function. Theoretical analysis and empirical comparison are made between the proposed method and two closely related methods (Linear Discriminant Analysis and Information Discriminant Analysis), and comparisons are also made with a method in which Renyi entropy is used to define the mutual information (in this case the gradient may be computed simply, under a special parameter setting). Relative to these alternative approaches, the proposed method achieves promising results on real datasets.
This paper studies non-asymptotic model selection for the general case of arbitrary design matrices and arbitrary nonzero entries of the signal. In this regard, it generalizes the notion of incoherence in the existing literature on model selection and introduces two fundamental measures of coherence---termed as the worst-case coherence and the average coherence---among the columns of a design matrix. It utilizes these two measures of coherence to provide an in-depth analysis of a simple, model-order agnostic one-step thresholding (OST) algorithm for model selection and proves that OST is feasible for exact as well as partial model selection as long as the design matrix obeys an easily verifiable property. One of the key insights offered by the ensuing analysis in this regard is that OST can successfully carry out model selection even when methods based on convex optimization such as the lasso fail due to the rank deficiency of the submatrices of the design matrix. In addition, the paper establishes that if the design matrix has reasonably small worst-case and average coherence then OST performs near-optimally when either (i) the energy of any nonzero entry of the signal is close to the average signal energy per nonzero entry or (ii) the signal-to-noise ratio in the measurement system is not too high. Finally, two other key contributions of the paper are that (i) it provides bounds on the average coherence of Gaussian matrices and Gabor frames, and (ii) it extends the results on model selection using OST to low-complexity, model-order agnostic recovery of sparse signals with arbitrary nonzero entries.
The ability to detect weak distributed activation patterns in networks is critical to several applications, such as identifying the onset of anomalous activity or incipient congestion in the Internet, or faint traces of a biochemical spread by a sensor network. This is a challenging problem since weak distributed patterns can be invisible in per node statistics as well as a global network-wide aggregate. Most prior work considers situations in which the activation/non-activation of each node is statistically independent, but this is unrealistic in many problems. In this paper, we consider structured patterns arising from statistical dependencies in the activation process. Our contributions are three-fold. First, we propose a sparsifying transform that succinctly represents structured activation patterns that conform to a hierarchical dependency graph. Second, we establish that the proposed transform facilitates detection of very weak activation patterns that cannot be detected with existing methods. Third, we show that the structure of the hierarchical dependency graph governing the activation process, and hence the network transform, can be learnt from very few (logarithmic in network size) independent snapshots of network activity.