Michael Rosen

A model selection approach for clustering a multinomial sequence with non-negative factorization

Aug 14, 2015
Nam H. Lee, Runze Tang, Carey E. Priebe, Michael Rosen

We consider the problem of clustering a sequence of multinomial observations by way of a model selection criterion. We propose a form of penalty term for the model selection procedure. Our approach subsumes the conventional AIC and BIC criteria and extends them so that they also apply to a sequence of sparse multinomial observations, where even within the same cluster the number of multinomial trials may differ between observations. In addition, as a preliminary estimation step to maximum likelihood estimation, and more generally to maximum $L_{q}$ estimation, we propose to use reduced rank projection in combination with non-negative factorization. We motivate our approach by showing that our model selection criterion and preliminary estimation step yield consistent estimates under simplifying assumptions. We also illustrate our approach through numerical experiments using real and simulated data.
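As a rough illustration of the preliminary estimation step described above, the following is a minimal sketch of a reduced rank projection followed by a non-negative factorization, applied to a matrix whose columns are multinomial count vectors with differing numbers of trials. It is not the authors' procedure: the function names (`rank_projection`, `nmf`, `cluster_columns`), the Lee-Seung multiplicative updates, the argmax labeling rule, and the toy data are all assumptions made for illustration, and the number of clusters `r` is taken as given rather than chosen by a model selection criterion.

```python
import numpy as np

def rank_projection(X, r):
    """Best rank-r approximation of X (truncated SVD), clipped to be non-negative."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xr = (U[:, :r] * s[:r]) @ Vt[:r, :]
    return np.clip(Xr, 0.0, None)

def nmf(X, r, n_iter=1000, seed=0, eps=1e-12):
    """Approximate X by W @ H with W, H >= 0 using multiplicative updates
    for the Frobenius loss (an assumed, generic choice of NMF algorithm)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each column of W sums to one; H absorbs the scale.
    scale = W.sum(axis=0) + eps
    return W / scale, H * scale[:, None]

def cluster_columns(counts, r):
    """Label each column (observation) by its dominant factor (assumed rule)."""
    P = counts / counts.sum(axis=0, keepdims=True)  # empirical proportions
    W, H = nmf(rank_projection(P, r), r)
    return H.argmax(axis=0)

# Toy data (hypothetical): two clusters, unequal numbers of trials per column.
rng = np.random.default_rng(1)
p1 = np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0])
p2 = np.array([0.0, 0.0, 0.0, 0.2, 0.3, 0.5])
trials = rng.integers(50, 200, size=20)
counts = np.column_stack(
    [rng.multinomial(n, p1 if j < 10 else p2) for j, n in enumerate(trials)]
)
labels = cluster_columns(counts, r=2)
```

Normalizing each column by its own number of trials is one simple way to put observations with different trial counts on a common scale before factorizing; the abstract's penalty term addresses the same issue at the model selection stage.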

Techniques for clustering interaction data as a collection of graphs

Jan 10, 2015
Nam H. Lee, Carey Priebe, Youngser Park, I-Jeng Wang, Michael Rosen

A natural approach to analyzing interaction data of the form "what-connects-to-what-when" is to create a time series (or rather a sequence) of graphs through temporal discretization (bandwidth selection) and spatial discretization (vertex contraction). Such discretization, together with non-negative factorization techniques, can be useful for clustering the resulting graphs. Motivating applications of clustering graphs (as opposed to clustering vertices) can be found in neuroscience and in social network analysis, and graph clustering can also be used to enhance community detection (i.e., vertex clustering) by conditioning on the cluster labels. In this paper, we formulate the problem of clustering graphs as a model selection problem. Our approach involves information criteria, non-negative matrix factorization and singular value thresholding, and we illustrate our techniques using real and simulated data.
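The two discretization steps named above can be sketched as follows. This is an illustrative assumption of how timestamped interactions might be turned into a sequence of graphs, not the paper's implementation: the function `graphs_from_events`, the `(time, src, dst)` event layout, the fixed-width time bins, and the `groups` mapping used for vertex contraction are all hypothetical.

```python
import numpy as np

def graphs_from_events(events, n_vertices, bandwidth, groups=None):
    """Bin (time, src, dst) events into a sequence of weighted adjacency matrices.

    bandwidth: width of each time window (temporal discretization).
    groups:    optional array mapping each vertex to a group id
               (spatial discretization, i.e. vertex contraction).
    """
    events = np.asarray(events, dtype=float)
    times = events[:, 0]
    src = events[:, 1].astype(int)
    dst = events[:, 2].astype(int)
    if groups is not None:
        groups = np.asarray(groups)
        src, dst = groups[src], groups[dst]
        n_vertices = int(groups.max()) + 1
    # Index of the time window each event falls into.
    bins = np.floor((times - times.min()) / bandwidth).astype(int)
    A = np.zeros((bins.max() + 1, n_vertices, n_vertices), dtype=int)
    # Accumulate counts; np.add.at handles repeated (bin, src, dst) indices.
    np.add.at(A, (bins, src, dst), 1)
    return A

# Hypothetical events: (time, source vertex, destination vertex).
events = [(0.0, 0, 1), (0.5, 1, 2), (1.2, 0, 1), (2.9, 3, 0)]
A = graphs_from_events(events, n_vertices=4, bandwidth=1.0)
A_contracted = graphs_from_events(
    events, n_vertices=4, bandwidth=1.0, groups=[0, 0, 1, 1]
)
# One column per graph, ready for a non-negative factorization.
X = A.reshape(A.shape[0], -1).T
```

Flattening each adjacency matrix into a column gives a non-negative matrix with one column per graph, which is the shape a non-negative factorization expects when the goal is to cluster graphs rather than vertices.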

Automatic Dimension Selection for a Non-negative Factorization Approach to Clustering Multiple Random Graphs

Sep 09, 2014
Nam H. Lee, I-Jeng Wang, Youngser Park, Carey E. Priebe, Michael Rosen

We consider the problem of grouping multiple graphs into several clusters using singular value thresholding and non-negative factorization. We derive a model selection information criterion to estimate the number of clusters. We demonstrate our approach using the "Swimmer" data set as well as a simulated data set, and compare its performance with that of two standard clustering algorithms.

* This paper has been withdrawn by the author due to a newer version with overlapping contents 
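
For readers unfamiliar with the singular value thresholding step named in the abstract above, a minimal sketch follows. It shows plain hard thresholding of singular values; the function name `hard_threshold_svd` and the threshold `tau` are hypothetical, and the abstract does not say how the authors choose the threshold.

```python
import numpy as np

def hard_threshold_svd(X, tau):
    """Keep only singular values above tau; return the reconstruction
    and the number of singular values kept."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = s > tau
    X_hat = (U[:, keep] * s[keep]) @ Vt[keep, :]
    return X_hat, int(keep.sum())

# Hypothetical example: two strong directions and one weak one.
X = np.diag([5.0, 3.0, 0.1])
X_hat, rank = hard_threshold_svd(X, tau=1.0)
```

The number of retained singular values gives a candidate inner dimension for a subsequent non-negative factorization.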