Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

Streaming Word Embeddings with the Space-Saving Algorithm

Apr 24, 2017
Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall

Share this with someone who'll enjoy it:

We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec. We compare our streaming algorithm to word2vec empirically by measuring the cosine similarity between word pairs under each algorithm and by applying each algorithm in the downstream task of hashtag prediction on a two-month interval of the Twitter sample stream. We then discuss the results of these experiments, concluding they provide partial validation of our approach as a streaming replacement for word2vec. Finally, we discuss potential failure modes and suggest directions for future work.

* 16 pages 

   Access Paper Source

Share this with someone who'll enjoy it: