Kevin Gimpel

Shammie

From Paraphrase Database to Compositional Paraphrase Model and Back

Aug 26, 2015

John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu, Dan Roth

Abstract: The Paraphrase Database (PPDB; Ganitkevitch et al., 2013) is an extensive semantic resource, consisting of a list of phrase pairs with (heuristic) confidence estimates. However, it is still unclear how it can best be used, due to the heuristic nature of the confidences and its necessarily incomplete coverage. We propose models to leverage the phrase pairs from the PPDB to build parametric paraphrase models that score paraphrase pairs more accurately than the PPDB's internal scores while simultaneously improving its coverage. They allow for learning phrase embeddings as well as improved word embeddings. Moreover, we introduce two new, manually annotated datasets to evaluate short-phrase paraphrasing models. Using our paraphrase model trained using PPDB, we achieve state-of-the-art results on standard word and bigram similarity tasks and beat strong baselines on our new short phrase paraphrase tasks.

* TACL Vol 3 (2015) pg 345-358
* 2015 TACL paper updated with an appendix describing new 300-dimensional embeddings. Submitted 1/2015. Accepted 2/2015. Published 6/2015.

Predicting the NFL using Twitter

Oct 25, 2013

Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah A. Smith

Abstract: We study the relationship between social media output and National Football League (NFL) games, using a dataset containing messages from Twitter and NFL game statistics. Specifically, we consider tweets pertaining to specific teams and games in the NFL season and use them alongside statistical game data to build predictive models for future game outcomes (which team will win?) and sports betting outcomes (which team will win with the point spread? will the total points be over/under the line?). We experiment with several feature sets and find that simple features using large volumes of tweets can match or exceed the performance of more traditional features that use game statistics.

* Presented at ECML/PKDD 2013 Workshop on Machine Learning and Data Mining for Sports Analytics
