Title: MIX'EM: Unsupervised Image Classification using a Mixture of Embeddings

Jul 18, 2020

Ali Varamesh, Tinne Tuytelaars

Abstract: We present MIX'EM, a novel solution for unsupervised image classification. Our model generates representations that by themselves are sufficient to drive a general-purpose clustering method to deliver high-quality classification without supervision. MIX'EM integrates an internal mixture of embeddings module into the contrastive visual representation learning framework to disentangle the representation space at the category level. It generates a set of embeddings from a visual representation and mixes them to construct the contrastive loss input. Parallel to the contrastive loss, we introduce three techniques to train MIX'EM and avoid a degenerate solution: (i) we maximize entropy across mixture components to diversify them, and (ii) we minimize component entropy conditioned on instances to enforce a clustered embedding space. Applying (i) and (ii) leads to the emergence of semantic categories through the mixture coefficients, making it possible to (iii) apply an associative embedding loss to enforce semantic separability directly. Subsequently, we run K-means on the representations to acquire semantic classification, which outperforms the state of the art by a large margin. We conduct extensive experiments and analyses on the STL10, CIFAR10, and CIFAR100-20 datasets, achieving 78%, 82%, and 44% accuracy, respectively. Essential to robust high accuracy is using MIX'EM to initialize K-means. Finally, we report impressively high accuracy baselines (70% on STL10) achieved solely by applying K-means to the "normalized" representations learned using the contrastive loss.
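The two entropy terms in (i) and (ii) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the mixture coefficients are available as an (N, K) matrix whose rows sum to 1, and it reads "entropy across mixture components" as the entropy of the batch-averaged coefficients. The function names are hypothetical, and how these terms are weighted against the contrastive loss is not stated in the abstract.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along `axis`; values are clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=axis)


def mixture_entropy_terms(coeffs):
    """Return (marginal entropy, mean per-instance entropy) of mixture coefficients.

    coeffs: array of shape (N, K); row i holds the K mixture coefficients of
    instance i and is assumed to sum to 1 (for example, a softmax output).
    """
    coeffs = np.asarray(coeffs, dtype=float)
    # Term (i): entropy of the average coefficient vector over the batch.
    # A high value means all components are used about equally often.
    h_marginal = entropy(coeffs.mean(axis=0))
    # Term (ii): entropy of each instance's coefficients, averaged over the batch.
    # A low value means each instance commits to a single component.
    h_conditional = entropy(coeffs, axis=1).mean()
    return h_marginal, h_conditional


def entropy_regularizer(coeffs):
    """Hypothetical combined term to minimize: -H(marginal) + mean H(instance)."""
    h_marginal, h_conditional = mixture_entropy_terms(coeffs)
    return -h_marginal + h_conditional
```

Under these assumptions, a batch in which every instance picks one component and the components are used equally often gives the lowest value of the combined term, while a batch that collapses onto a single component scores no better than a uniform assignment. That is one way to see why both terms are needed to avoid a degenerate solution.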
