Better Summarization Evaluation with Word Embeddings for ROUGE

Aug 25, 2015
Jun-Ping Ng, Viktoria Abrecht

ROUGE is a widely adopted automatic evaluation measure for text summarization. While it has been shown to correlate well with human judgements, it is biased towards surface lexical similarities. This makes it unsuitable for evaluating abstractive summarization, or summaries with substantial paraphrasing. We study the effectiveness of word embeddings in overcoming this disadvantage of ROUGE. Specifically, instead of measuring lexical overlaps, we use word embeddings to compute the semantic similarity of the words used in summaries. Our experimental results show that our proposal achieves better correlations with human judgements when measured with the Spearman and Kendall rank coefficients.
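
The contrast between lexical overlap and embedding-based similarity can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify their exact scoring formula. The toy vectors, the function names, and the "best match per reference word" aggregation are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy 3-dimensional vectors, invented for illustration only.
# A real system would load pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "cat": (0.9, 0.1, 0.0),
    "feline": (0.85, 0.15, 0.05),
    "sat": (0.1, 0.9, 0.0),
    "rested": (0.15, 0.8, 0.1),
    "mat": (0.0, 0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_similarity(w1, w2, embeddings):
    """Return 1.0 for identical words, 0.0 if either word has no vector."""
    if w1 == w2:
        return 1.0
    if w1 not in embeddings or w2 not in embeddings:
        return 0.0
    return cosine(embeddings[w1], embeddings[w2])


def exact_unigram_recall(reference, candidate):
    """ROUGE-1-style recall: clipped count of shared unigrams over reference length."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    total = sum(ref.values())
    if total == 0:
        return 0.0
    return sum(min(n, cand[w]) for w, n in ref.items()) / total


def soft_unigram_recall(reference, candidate, embeddings):
    """Hypothetical soft recall: each reference word is credited with its
    best similarity to any candidate word, instead of a 0/1 exact match."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    total = sum(
        max(word_similarity(r, c, embeddings) for c in cand_tokens)
        for r in ref_tokens
    )
    return total / len(ref_tokens)


reference = "cat sat mat"
paraphrase = "feline rested mat"
exact = exact_unigram_recall(reference, paraphrase)
soft = soft_unigram_recall(reference, paraphrase, TOY_EMBEDDINGS)
print(f"exact recall: {exact:.3f}, soft recall: {soft:.3f}")
```

With these toy vectors, the paraphrased candidate shares only one exact word with the reference, so the exact-match score is low, while the soft score gives partial credit to the near-synonyms. Whether such a score tracks human judgements better is an empirical question that the paper addresses with Spearman and Kendall rank correlations.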

* Pre-print. To appear in the proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
