
A Neural Corpus Indexer for Document Retrieval

Jun 06, 2022


Current state-of-the-art document retrieval solutions mainly follow an index-retrieve paradigm, in which the index is hard to optimize directly for the final retrieval target. In this paper, we aim to show that an end-to-end deep neural network unifying the training and indexing stages can significantly improve the recall performance of traditional methods. To this end, we propose the Neural Corpus Indexer (NCI), a sequence-to-sequence network that generates relevant document identifiers directly for a given query. To optimize the recall performance of NCI, we design a prefix-aware weight-adaptive decoder architecture and leverage tailored techniques including query generation, semantic document identifiers, and consistency-based regularization. Empirical studies demonstrate the superiority of NCI on a commonly used academic benchmark, achieving a +51.9% relative improvement on the NQ320k dataset compared to the best baseline.
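
The abstract does not spell out how identifiers are decoded, so the following is only a minimal sketch of what "generating document identifiers directly" could look like in practice. It assumes each document identifier is a tuple of integer tokens and that decoding is restricted to prefixes of known identifiers via a trie combined with beam search. The names `build_trie`, `constrained_beam_search`, and `toy_score` are hypothetical, and `toy_score` is a stand-in for a trained sequence-to-sequence decoder; this is not the paper's prefix-aware weight-adaptive decoder.

```python
from typing import Callable, Dict, List, Sequence, Tuple

DocId = Tuple[int, ...]
END = -1  # sentinel key marking a complete identifier (assumes real tokens are >= 0)


def build_trie(doc_ids: Sequence[DocId]) -> Dict:
    """Build a nested-dict trie over identifier tokens."""
    root: Dict = {}
    for doc_id in doc_ids:
        node = root
        for tok in doc_id:
            node = node.setdefault(tok, {})
        node[END] = True
    return root


def constrained_beam_search(
    trie: Dict,
    score_fn: Callable[[DocId, int], float],
    beam_size: int,
) -> List[DocId]:
    """Return up to `beam_size` valid identifiers, best first.

    `score_fn(prefix, token)` is a hypothetical stand-in for the decoder's
    log-probability of `token` given the query and the tokens decoded so far.
    """
    beams = [(0.0, (), trie)]  # (cumulative score, prefix, trie node)
    finished: List[Tuple[float, DocId]] = []
    while beams:
        candidates = []
        for score, prefix, node in beams:
            for tok, child in node.items():
                if tok == END:
                    # The prefix is itself a complete document identifier.
                    finished.append((score, prefix))
                else:
                    candidates.append(
                        (score + score_fn(prefix, tok), prefix + (tok,), child)
                    )
        # Keep only the highest-scoring partial identifiers.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]
    finished.sort(key=lambda f: f[0], reverse=True)
    return [doc_id for _, doc_id in finished[:beam_size]]


# Toy usage: three documents, and a scorer that favours the identifier (0, 2).
doc_ids = [(0, 1), (0, 2), (1, 0)]
target = (0, 2)


def toy_score(prefix: DocId, tok: int) -> float:
    pos = len(prefix)
    return 0.0 if pos < len(target) and target[pos] == tok else -1.0


ranked = constrained_beam_search(build_trie(doc_ids), toy_score, beam_size=2)
print(ranked)
```

Because every expansion follows an edge of the trie, the search can only return identifiers that exist in the corpus. A real system would replace `toy_score` with model log-probabilities conditioned on the query.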

* 18 pages, 4 figures  
