Semi-Supervised Class Discovery

Feb 10, 2020
Jeremy Nixon, Jeremiah Liu, David Berthelot

One promising approach to handling datapoints that fall outside the initial training distribution (OOD) is to create new classes that capture similarities among the datapoints previously rejected as uncategorizable. Systems that generate labels can be deployed against an arbitrary amount of data, discovering classification schemes that, through training, yield a higher-quality representation of the data. We introduce Dataset Reconstruction Accuracy, a new measure of how effectively a model creates labels, and we introduce benchmarks against this metric. We apply a new heuristic, class learnability, to decide whether a class is worth adding to the training dataset. We show that our method applies to language through the CLINC Out-of-scope dataset. Finally, we present a class discovery system that, given only half of the classes at training time, achieves 91% reconstruction accuracy on MNIST, 73% reconstruction accuracy on CIFAR-10, and 87% reconstruction accuracy on Fashion-MNIST, demonstrating the value of semi-supervised learning for automatically discovering classes.
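
The abstract does not define how reconstruction accuracy is computed, so the following is only an illustrative sketch of one plausible reading, not the authors' metric: each discovered label is mapped to the ground-truth class most common among its members, and the score is the fraction of datapoints whose mapped label matches the true class. The function name `reconstruction_accuracy`, the majority-vote mapping, and the toy labels are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def reconstruction_accuracy(true_labels, discovered_labels):
    """Fraction of points whose discovered label, mapped by majority vote
    to a ground-truth class, agrees with the true class.

    Assumption: this majority-vote mapping is a stand-in chosen for
    illustration; the paper's own definition may differ.
    """
    if len(true_labels) != len(discovered_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must be non-empty")

    # Count which true classes fall under each discovered label.
    members = defaultdict(Counter)
    for true, found in zip(true_labels, discovered_labels):
        members[found][true] += 1

    # Map each discovered label to its most frequent true class.
    mapping = {found: counts.most_common(1)[0][0]
               for found, counts in members.items()}

    correct = sum(mapping[found] == true
                  for true, found in zip(true_labels, discovered_labels))
    return correct / len(true_labels)


# Toy data (hypothetical): three true classes, three discovered labels.
true_labels = [0, 0, 0, 1, 1, 1, 2, 2]
discovered = ["a", "a", "b", "b", "b", "b", "c", "c"]
score = reconstruction_accuracy(true_labels, discovered)
print(score)
```

In this toy case one point of class 0 is absorbed into the discovered label that mostly covers class 1, so seven of eight points are reconstructed. Note that a majority-vote mapping rewards over-splitting (every point in its own label scores perfectly), so a one-to-one matching between discovered and true classes would be a stricter alternative.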
