TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection

Feb 20, 2018
Tirthankar Ghosal, Amitra Salam, Swati Tiwari, Asif Ekbal, Pushpak Bhattacharyya



Detecting the novelty of an entire document is an Artificial Intelligence (AI) frontier problem with widespread NLP applications, such as extractive document summarization, tracking the development of news events, and predicting the impact of scholarly articles. Important though the problem is, we are unaware of any document-level benchmark dataset that correctly addresses the evaluation of automatic novelty detection techniques in a classification framework. To bridge this gap, we present a resource for benchmarking document-level novelty detection techniques. We create the resource by periodically crawling event-specific news documents across several domains. We release the annotated corpus with the necessary statistics and demonstrate its use with a system developed for the problem.

* Accepted for publication at the Language Resources and Evaluation Conference (LREC) 2018

