Stratis Ioannidis

Technicolor

Iterative Spectral Method for Alternative Clustering

Sep 08, 2019

Chieh Wu, Stratis Ioannidis, Mario Sznaier, Xiangyu Li, David Kaeli, Jennifer G. Dy

Abstract: Given a dataset and an existing clustering as input, alternative clustering aims to find an alternative partition. One of the state-of-the-art approaches is Kernel Dimension Alternative Clustering (KDAC). We propose a novel Iterative Spectral Method (ISM) that greatly improves the scalability of KDAC. Our algorithm is intuitive, relies on easily implementable spectral decompositions, and comes with theoretical guarantees. Its computation time improves upon existing implementations of KDAC by as much as 5 orders of magnitude.

Deep Kernel Learning for Clustering

Aug 09, 2019

Chieh Wu, Zulqarnain Khan, Yale Chang, Stratis Ioannidis, Jennifer Dy

Abstract: We propose a deep learning approach for discovering kernels tailored to identifying clusters over sample data. Our neural network produces sample embeddings that are motivated by, and are at least as expressive as, spectral clustering. Our training objective, based on the Hilbert-Schmidt Independence Criterion (HSIC), can be optimized via gradient adaptations on the Stiefel manifold, leading to significant acceleration over spectral methods relying on eigendecompositions. Finally, our trained embedding can be directly applied to out-of-sample data. We show experimentally that our approach outperforms several state-of-the-art deep clustering methods, as well as traditional approaches such as $k$-means and spectral clustering, over a broad array of real-life and synthetic datasets.
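
For readers unfamiliar with the Hilbert-Schmidt criterion named in the abstract above, the following is a minimal sketch of the standard biased empirical HSIC estimator, $\mathrm{tr}(KHLH)/(n-1)^2$ with centering matrix $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$. It is not the paper's training objective or network; the Gaussian kernel, the bandwidth `sigma`, and the function names `gaussian_kernel` and `hsic` are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X (kernel choice is an assumption)."""
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances; clip tiny negatives caused by rounding.
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(K, L):
    """Biased empirical HSIC between two n-by-n kernel matrices K and L."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 1))
    noise = rng.normal(size=(200, 1))  # drawn independently of x
    print("HSIC(x, x)     =", hsic(gaussian_kernel(x), gaussian_kernel(x)))
    print("HSIC(x, noise) =", hsic(gaussian_kernel(x), gaussian_kernel(noise)))
```

A larger value indicates stronger statistical dependence between the two samples, which is why a criterion of this kind can serve as a clustering objective between embeddings and cluster assignments.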

Accelerated Experimental Design for Pairwise Comparisons

Jan 18, 2019

Yuan Guo, Jennifer Dy, Deniz Erdogmus, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Stratis Ioannidis

Abstract: Pairwise comparison labels are more informative and less variable than class labels, but generating them poses a challenge: their number grows quadratically in the dataset size. We study a natural experimental design objective, namely, D-optimality, that can be used to identify which $K$ pairwise comparisons to generate. This objective is known to perform well in practice, and is submodular, making the selection approximable via the greedy algorithm. A naïve greedy implementation has $O(N^2 d^2 K)$ complexity, where $N$ is the dataset size, $d$ is the feature space dimension, and $K$ is the number of generated comparisons. We show that, by exploiting the inherent geometry of the dataset (namely, that it consists of pairwise comparisons), the greedy algorithm's complexity can be reduced to $O(N^2(K+d) + N(dK+d^2) + d^2K)$. We apply the same acceleration also to the so-called lazy greedy algorithm. When combined, the above improvements lead to an execution time of less than 1 hour for a dataset with $10^8$ comparisons; the naïve greedy algorithm on the same dataset would require more than 10 days to terminate.
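
As a rough illustration of the baseline this abstract describes, the sketch below implements a naïve greedy selection for a D-optimal objective over pairwise comparisons. It is not the accelerated algorithm from the paper. The objective form $\log\det\big(\lambda I + \sum_{(i,j)\in S}(x_i - x_j)(x_i - x_j)^\top\big)$, the regularizer `lam`, and the function name `greedy_d_optimal_pairs` are assumptions made for the example. Marginal gains follow from the matrix determinant lemma, and the inverse is maintained with a Sherman-Morrison update.

```python
import itertools

import numpy as np


def greedy_d_optimal_pairs(X, K, lam=1.0):
    """Naively pick K pairs (i, j) that greedily maximize
    log det(lam * I + sum of (x_i - x_j)(x_i - x_j)^T)."""
    n, d = X.shape
    pairs = list(itertools.combinations(range(n), 2))
    Z = np.array([X[i] - X[j] for i, j in pairs])  # one difference vector per pair
    A_inv = np.eye(d) / lam                        # inverse of the design matrix
    available = np.ones(len(pairs), dtype=bool)
    chosen = []
    for _ in range(K):
        # Gain of adding pair p is log(1 + z_p^T A^{-1} z_p), so it suffices
        # to maximize the quadratic form over the remaining pairs.
        quad = np.einsum("pd,de,pe->p", Z, A_inv, Z)
        quad[~available] = -np.inf
        best = int(np.argmax(quad))
        z = Z[best]
        Az = A_inv @ z
        # Sherman-Morrison rank-one update of the inverse.
        A_inv = A_inv - np.outer(Az, Az) / (1.0 + z @ Az)
        available[best] = False
        chosen.append(pairs[best])
    return chosen


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
    print(greedy_d_optimal_pairs(X, K=3))
```

Each greedy step here scores all $O(N^2)$ candidate pairs with a $d \times d$ quadratic form, which matches the $O(N^2 d^2 K)$ cost the abstract attributes to the naïve implementation.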

ORACLE: Optimized Radio clAssification through Convolutional neuraL nEtworks

Dec 03, 2018

Kunal Sankhe, Mauro Belgiovine, Fan Zhou, Shamnaz Riyaz, Stratis Ioannidis, Kaushik Chowdhury

Abstract: This paper describes the architecture and performance of ORACLE, an approach for detecting a unique radio from a large pool of bit-similar devices (same hardware, protocol, physical address, MAC ID) using only IQ samples at the physical layer. ORACLE trains a convolutional neural network (CNN) that balances computational time and accuracy, showing 99% classification accuracy for a 16-node USRP X310 SDR testbed and an external database of >100 COTS WiFi devices. Our work makes the following contributions: (i) it studies the hardware-centric features within the transmitter chain that cause IQ sample variations; (ii) for an idealized static channel environment, it proposes a CNN architecture requiring only raw IQ samples accessible at the front-end, without channel estimation or prior knowledge of the communication protocol; (iii) for dynamic channels, it demonstrates a principled method of feedback-driven transmitter-side modifications that uses channel estimation at the receiver to increase differentiability for the CNN classifier. The key innovation here is to intentionally introduce controlled imperfections on the transmitter side through software directives, while minimizing the change in bit error rate. Unlike previous work that imposes constant environmental conditions, ORACLE adopts the 'train once deploy anywhere' paradigm with near-perfect device classification accuracy.

* Accepted in IEEE INFOCOM 2019, Paris, France, May 2019

Deep feature transfer between localization and segmentation tasks

Nov 10, 2018

Szu-Yeu Hu, Andrew Beers, Ken Chang, Kathi Höbel, J. Peter Campbell, Deniz Erdogmus, Stratis Ioannidis, Jennifer Dy, Michael F. Chiang, Jayashree Kalpathy-Cramer (+1 more)

Abstract: In this paper, we propose a new pre-training scheme for U-net based image segmentation. We first train the encoding arm as a localization network to predict the center of the target, before extending it into a U-net architecture for segmentation. We apply our proposed method to the problem of segmenting the optic disc from fundus photographs. Our work shows that the features learned by the encoding arm can be transferred to the segmentation network to reduce the annotation burden. We propose that such an approach could have broad utility for medical image segmentation, and alleviate the burden of delineating complex structures by pre-training on annotations that are much easier to acquire.

Learning Combinations of Sigmoids Through Gradient Estimation

Jan 17, 2018

Stratis Ioannidis, Andrea Montanari

Abstract: We develop a new approach to learn the parameters of regression models with hidden variables. In a nutshell, we estimate the gradient of the regression function at a set of random points, and cluster the estimated gradients. The centers of the clusters are used as estimates for the parameters of hidden units. We justify this approach by studying a toy model, whereby the regression function is a linear combination of sigmoids. We prove that indeed the estimated gradients concentrate around the parameter vectors of the hidden units, and provide non-asymptotic bounds on the number of required samples. To the best of our knowledge, no comparable guarantees have been proven for linear combinations of sigmoids.

Truthful Linear Regression

Jun 10, 2015

Rachel Cummings, Stratis Ioannidis, Katrina Ligett

Abstract: We consider the problem of fitting a linear model to data held by individuals who are concerned about their privacy. Incentivizing most players to truthfully report their data to the analyst constrains our design to mechanisms that provide a privacy guarantee to the participants; we use differential privacy to model individuals' privacy losses. This immediately poses a problem, as differentially private computation of a linear model necessarily produces a biased estimation, and existing approaches to design mechanisms to elicit data from privacy-sensitive individuals do not generalize well to biased estimators. We overcome this challenge through an appropriate design of the computation and payment scheme.

* To appear in Proceedings of the 28th Annual Conference on Learning Theory (COLT 2015)

Guess Who Rated This Movie: Identifying Users Through Subspace Clustering

Aug 09, 2014

Amy Zhang, Nadia Fawaz, Stratis Ioannidis, Andrea Montanari

Abstract: It is often the case that, within an online recommender system, multiple users share a common account. Can such shared accounts be identified solely on the basis of the user-provided ratings? Once a shared account is identified, can the different users sharing it be identified as well? Whenever such user identification is feasible, it opens the way to possible improvements in personalized recommendations, but also raises privacy concerns. We develop a model for composite accounts based on unions of linear subspaces, and use subspace clustering for carrying out the identification task. We show that a significant fraction of such accounts is identifiable in a reliable manner, and illustrate potential uses for personalized recommendation.

* Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

Learning Mixtures of Linear Classifiers

Jul 30, 2014

Yuekai Sun, Stratis Ioannidis, Andrea Montanari

Abstract: We consider a discriminative learning (regression) problem, whereby the regression function is a convex combination of $k$ linear classifiers. Existing approaches are based on the EM algorithm, or similar techniques, without provable guarantees. We develop a simple method based on spectral techniques and a 'mirroring' trick, that discovers the subspace spanned by the classifiers' parameter vectors. Under a probabilistic assumption on the feature vector distribution, we prove that this approach has nearly optimal statistical efficiency.

Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization

Jul 30, 2014

Smriti Bhagat, Udi Weinsberg, Stratis Ioannidis, Nina Taft

Abstract: Recommender systems leverage user demographic information, such as age, gender, etc., to personalize recommendations and better place their targeted ads. Oftentimes, users do not volunteer this information due to privacy concerns, or due to a lack of initiative in filling out their online profiles. We illustrate a new threat in which a recommender learns private attributes of users who do not voluntarily disclose them. We design both passive and active attacks that solicit ratings for strategically selected items, and could thus be used by a recommender system to pursue this hidden agenda. Our methods are based on a novel usage of Bayesian matrix factorization in an active learning setting. Evaluations on multiple datasets illustrate that such attacks are indeed feasible and use significantly fewer rated items than static inference methods. Importantly, they succeed without sacrificing the quality of recommendations to users.

* This is the extended version of a paper that appeared in ACM RecSys 2014
