Abstract:Generative AI (genAI) technologies -- specifically, large language models (LLMs) -- and search have evolving relations. We argue for a novel perspective: using genAI to enrich a document corpus so as to improve query-based retrieval effectiveness. The enrichment is based on modifying existing documents or generating new ones. As an empirical proof of concept, we use LLMs to generate documents relevant to a topic which are more retrievable than existing ones. In addition, we demonstrate the potential merits of using corpus enrichment for retrieval augmented generation (RAG) and answer attribution in question answering.
Abstract:We study prompting-based approaches with Large Language Models (LLMs) for modifying documents so as to promote their ranking in a competitive search setting. Our methods are inspired by prior work on leveraging LLMs as rankers. We evaluate our approach by deploying it as a bot in previous ranking competitions and in competitions we organized. Our findings demonstrate that our approach effectively improves document ranking while preserving high levels of faithfulness to the original content and maintaining overall document quality.
Abstract:In Web retrieval, there are many cases of competition between authors of Web documents: their incentive is to have their documents highly ranked for queries of interest. As such, the Web is a prominent example of a competitive search setting. Past work on competitive search focused on ranking functions based solely on relevance estimation. We study ranking functions that integrate a results-diversification aspect. We show that the competitive search setting with diversity-based ranking has an equilibrium. Furthermore, we theoretically and empirically show that the phenomenon of authors mimicking content in documents highly ranked in the past, which was demonstrated in previous work, is mitigated when search results diversification is applied.