Guido Zuccon

Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning

Mar 08, 2025

Shengyao Zhuang, Xueguang Ma, Bevan Koopman, Jimmy Lin, Guido Zuccon

Abstract: In this paper, we introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Existing document reranking methods based on large language models (LLMs) typically rely on prompting or fine-tuning LLMs to order or label candidate documents according to their relevance to a query. For Rank-R1, we use a reinforcement learning algorithm along with only a small set of relevance labels (without any reasoning supervision) to enhance the reasoning ability of LLM-based rerankers. Our hypothesis is that adding reasoning capabilities to the rerankers can improve their relevance assessment and ranking capabilities. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries. In particular, we find that Rank-R1 achieves effectiveness on in-domain datasets on par with that of supervised fine-tuning methods, while using only 18% of the training data used by the fine-tuning methods. We also find that the model largely outperforms zero-shot and supervised fine-tuning when applied to out-of-domain datasets featuring complex queries, especially when a 14B-size model is used. Finally, we qualitatively observe that Rank-R1's reasoning process improves the explainability of the ranking results, opening new opportunities for search engine results presentation and fruition.

Leveraging Semantic Type Dependencies for Clinical Named Entity Recognition

Mar 07, 2025

Linh Le, Guido Zuccon, Gianluca Demartini, Genghong Zhao, Xia Zhang

Abstract: Previous work on clinical relation extraction from free-text sentences leveraged information about semantic types from clinical knowledge bases as a part of entity representations. In this paper, we exploit additional evidence by also making use of domain-specific semantic type dependencies. We encode the relation between a span of tokens matching a Unified Medical Language System (UMLS) concept and other tokens in the sentence. We implement our method and compare against different named entity recognition (NER) architectures (i.e., BiLSTM-CRF and BiLSTM-GCN-CRF) using different pre-trained clinical embeddings (i.e., BERT, BioBERT, UMLSBert). Our experimental results on clinical datasets show that in some cases NER effectiveness can be significantly improved by making use of domain-specific semantic type dependencies. Our work is also the first study generating a matrix encoding to make use of more than three dependencies in one pass for the NER task.

* AMIA - American Medical Informatics Association 2022

DenseReviewer: A Screening Prioritisation Tool for Systematic Review based on Dense Retrieval

Feb 05, 2025

Xinyu Mao, Teerapong Leelanupab, Harrisen Scells, Guido Zuccon

Abstract: Screening is a time-consuming and labour-intensive yet required task for medical systematic reviews, as tens of thousands of studies often need to be screened. Prioritising relevant studies to be screened allows downstream systematic review creation tasks to start earlier and save time. In previous work, we developed a dense retrieval method to prioritise relevant studies with reviewer feedback during the title and abstract screening stage. Our method outperforms previous active learning methods in both effectiveness and efficiency. In this demo, we extend this prior work by creating (1) a web-based screening tool that enables end-users to screen studies exploiting state-of-the-art methods and (2) a Python library that integrates models and feedback mechanisms and allows researchers to develop and demonstrate new active learning methods. We describe the tool's design and showcase how it can aid screening. The tool is available at https://densereviewer.ielab.io. The source code is also open sourced at https://github.com/ielab/densereviewer.

* Accepted at ECIR 2025

Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks

Jan 28, 2025

Shengyao Zhuang, Ekaterina Khramtsova, Xueguang Ma, Bevan Koopman, Jimmy Lin, Guido Zuccon

Abstract: Recent advancements in dense retrieval have introduced vision-language model (VLM)-based retrievers, such as DSE and ColPali, which leverage document screenshots embedded as vectors to enable effective search and offer a simplified pipeline over traditional text-only methods. In this study, we propose three pixel poisoning attack methods designed to compromise VLM-based retrievers and evaluate their effectiveness under various attack settings and parameter configurations. Our empirical results demonstrate that injecting even a single adversarial screenshot into the retrieval corpus can significantly disrupt search results, poisoning the top-10 retrieved documents for 41.9% of queries in the case of DSE and 26.4% for ColPali. These vulnerability rates notably exceed those observed with equivalent attacks on text-only retrievers. Moreover, when targeting a small set of known queries, the attack success rate rises, achieving complete success in certain cases. By exposing the vulnerabilities inherent in vision-language models, this work highlights the potential risks associated with their deployment.
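
The percentages quoted in this abstract are read here as the share of queries whose top-10 results contain at least one injected screenshot. The following is a minimal sketch of that measurement, not the authors' evaluation code: the function name `poisoned_at_k`, the dictionary layout, and all document IDs are hypothetical.

```python
def poisoned_at_k(rankings, adversarial_ids, k=10):
    """Fraction of queries whose top-k results contain an adversarial document.

    rankings: dict mapping a query ID to its ranked list of document IDs.
    adversarial_ids: IDs of the documents injected by the attacker.
    """
    if not rankings:
        return 0.0
    adversarial = set(adversarial_ids)
    hits = sum(
        1
        for ranked in rankings.values()
        if any(doc_id in adversarial for doc_id in ranked[:k])
    )
    return hits / len(rankings)


# Toy data: every ID below is made up for illustration.
rankings = {
    "q1": ["d3", "adv1", "d7"],
    "q2": ["d1", "d2", "d5"],
    "q3": ["d9", "d4", "adv1"],
    "q4": ["d2", "d8", "d6"],
}
rate = poisoned_at_k(rankings, ["adv1"], k=2)
print(f"poisoned@2 = {rate:.2f}")
```

With a cutoff of 2, only `q1` counts as poisoned; raising the cutoff to 3 also captures `q3`, which shows how sensitive the measure is to the chosen depth.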

VISA: Retrieval Augmented Generation with Visual Source Attribution

Dec 19, 2024

Xueguang Ma, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Wenhu Chen, Jimmy Lin

Abstract: Generation with source attribution is important for enhancing the verifiability of retrieval-augmented generation (RAG) systems. However, existing approaches in RAG primarily link generated content to document-level references, making it challenging for users to locate evidence among multiple content-rich retrieved documents. To address this challenge, we propose Retrieval-Augmented Generation with Visual Source Attribution (VISA), a novel approach that combines answer generation with visual source attribution. Leveraging large vision-language models (VLMs), VISA identifies the evidence and highlights the exact regions that support the generated answers with bounding boxes in the retrieved document screenshots. To evaluate its effectiveness, we curated two datasets: Wiki-VISA, based on crawled Wikipedia webpage screenshots, and Paper-VISA, derived from PubLayNet and tailored to the medical domain. Experimental results demonstrate the effectiveness of VISA for visual source attribution on documents' original look, and highlight the challenges that remain. Code, data, and model checkpoints will be released.

2D Matryoshka Training for Information Retrieval

Nov 26, 2024

Shuai Wang, Shengyao Zhuang, Bevan Koopman, Guido Zuccon

Abstract: 2D Matryoshka Training is an advanced embedding representation training approach designed to train an encoder model simultaneously across various layer-dimension setups. This method has demonstrated higher effectiveness in Semantic Text Similarity (STS) tasks over traditional training approaches when using sub-layers for embeddings. Despite its success, discrepancies exist between two published implementations, leading to varied comparative results with baseline models. In this reproducibility study, we implement and evaluate both versions of 2D Matryoshka Training on STS tasks and extend our analysis to retrieval tasks. Our findings indicate that while both versions achieve higher effectiveness than traditional Matryoshka training on sub-dimensions, and traditional full-sized model training approaches, they do not outperform models trained separately on specific sub-layer and sub-dimension setups. Moreover, these results generalize well to retrieval tasks, both in supervised (MSMARCO) and zero-shot (BEIR) settings. Further explorations of different loss computations reveal more suitable implementations for retrieval tasks, such as incorporating full-dimension loss and training on a broader range of target dimensions. Conversely, some intuitive approaches, such as fixing document encoders to full model outputs, do not yield improvements. Our reproduction code is available at https://github.com/ielab/2DMSE-Reproduce.

Starbucks: Improved Training for 2D Matryoshka Embeddings

Oct 17, 2024

Shengyao Zhuang, Shuai Wang, Bevan Koopman, Guido Zuccon

Abstract: Effective approaches that can scale embedding model depth (i.e. layers) and embedding size allow for the creation of models that are highly scalable across different computational resources and task requirements. While the recently proposed 2D Matryoshka training approach can efficiently produce a single embedding model such that its sub-layers and sub-dimensions can measure text similarity, its effectiveness is significantly worse than if smaller models were trained separately. To address this issue, we propose Starbucks, a new training strategy for Matryoshka-like embedding models, which encompasses both the fine-tuning and pre-training phases. For the fine-tuning phase, we discover that, rather than sampling a random sub-layer and sub-dimension for each training step, providing a fixed list of layer-dimension pairs, from small to large sizes, and computing the loss across all pairs significantly improves the effectiveness of 2D Matryoshka embedding models, bringing them on par with their separately trained counterparts. To further enhance performance, we introduce a new pre-training strategy, which applies masked autoencoder language modelling to sub-layers and sub-dimensions during pre-training, resulting in a stronger backbone for subsequent fine-tuning of the embedding model. Experimental results on both semantic text similarity and retrieval benchmarks demonstrate that the proposed pre-training and fine-tuning strategies significantly improve effectiveness over 2D Matryoshka models, enabling Starbucks models to perform more efficiently and effectively than separately trained models.
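
The fine-tuning idea described in this abstract, computing the loss over a fixed list of layer-dimension pairs instead of one randomly sampled pair, can be sketched as below. This is an illustrative toy under stated assumptions, not the Starbucks implementation: the squared-error loss on cosine similarity, the function names, and the toy embeddings are all hypothetical, and the abstract does not specify which loss the authors use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_size_loss(layers_a, layers_b, target, pairs):
    """Average squared error over every (layer, dimension) pair.

    layers_a[i] and layers_b[i] hold the embedding produced by layer i + 1
    for text A and text B. Each pair (layer, dim) selects one layer's output
    and keeps only its first `dim` components.
    """
    total = 0.0
    for layer, dim in pairs:
        u = layers_a[layer - 1][:dim]
        v = layers_b[layer - 1][:dim]
        total += (cosine(u, v) - target) ** 2
    return total / len(pairs)


# Hypothetical fixed list of pairs, ordered from small to large.
pairs = [(1, 2), (2, 4)]

# Toy per-layer embeddings for two texts (two layers, four dimensions).
layers_a = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
layers_b = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

loss = multi_size_loss(layers_a, layers_b, target=1.0, pairs=pairs)
print(f"loss = {loss:.2f}")
```

Because every pair contributes to each update, no sub-model is left untrained on a given step, which is the contrast with random sampling that the abstract draws.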

Does Vec2Text Pose a New Corpus Poisoning Threat?

Oct 09, 2024

Shengyao Zhuang, Bevan Koopman, Guido Zuccon

Abstract: The emergence of Vec2Text -- a method for text embedding inversion -- has raised serious privacy concerns for dense retrieval systems which use text embeddings. This threat comes from the ability of an attacker with access to embeddings to reconstruct the original text. In this paper, we take a new look at Vec2Text and investigate how much of a threat it poses to a different type of attack, corpus poisoning, whereby an attacker injects adversarial passages into a retrieval corpus with the intention of misleading dense retrievers. Theoretically, Vec2Text is far more dangerous than previous attack methods because it does not need access to the embedding model's weights and it can efficiently generate many adversarial passages. We show that under certain conditions, corpus poisoning with Vec2Text can pose a serious threat to dense retriever system integrity and user experience by injecting adversarial passages into top ranked positions. Code and data are made available at https://github.com/ielab/vec2text-corpus-poisoning

* arXiv admin note: substantial text overlap with arXiv:2402.12784

Source-Free Domain-Invariant Performance Prediction

Aug 06, 2024

Ekaterina Khramtsova, Mahsa Baktashmotlagh, Guido Zuccon, Xi Wang, Mathieu Salzmann

Abstract: Accurately estimating model performance poses a significant challenge, particularly in scenarios where the source and target domains follow different data distributions. Most existing performance prediction methods heavily rely on the source data in their estimation process, limiting their applicability in a more realistic setting where only the trained model is accessible. The few methods that do not require source data exhibit considerably inferior performance. In this work, we propose a source-free approach centred on uncertainty-based estimation, using a generative model for calibration in the absence of source data. We establish connections between our approach for unsupervised calibration and temperature scaling. We then employ a gradient-based strategy to evaluate the correctness of the calibrated predictions. Our experiments on benchmark object recognition datasets reveal that existing source-based methods fall short with limited source sample availability. Furthermore, our approach significantly outperforms the current state-of-the-art source-free and source-based methods, affirming its effectiveness in domain-invariant performance estimation.
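
For readers unfamiliar with temperature scaling, the standard technique this abstract connects its calibration to, the sketch below divides the logits by a temperature before the softmax. It illustrates only that standard operation and not the paper's source-free, generative-model-based calibration. The function name is hypothetical, and the temperature is fixed by hand here, whereas it is normally fitted on held-out labelled data.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits for a three-class prediction (values are made up).
logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=1.0)
soft = softmax_with_temperature(logits, temperature=2.0)
print(max(sharp), max(soft))
```

A temperature above 1 lowers the top-class confidence without changing which class is predicted, which is why the technique adjusts calibration while leaving accuracy untouched.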

* Accepted in ECCV 2024

Embark on DenseQuest: A System for Selecting the Best Dense Retriever for a Custom Collection

Jul 09, 2024

Ekaterina Khramtsova, Teerapong Leelanupab, Shengyao Zhuang, Mahsa Baktashmotlagh, Guido Zuccon

Abstract: In this demo we present a web-based application for selecting an effective pre-trained dense retriever to use on a private collection. Our system, DenseQuest, provides unsupervised selection and ranking capabilities to predict the best dense retriever among a pool of available dense retrievers, tailored to an uploaded target collection. DenseQuest implements a number of existing approaches, including a recent, highly effective method powered by Large Language Models (LLMs), which requires neither queries nor relevance judgments. The system is designed to be intuitive and easy to use for those information retrieval engineers and researchers who need to identify a general-purpose dense retrieval model to encode or search a new private target collection. Our demonstration illustrates the conceptual architecture and the different use case scenarios of the system implemented on the cloud, enabling universal access and use. DenseQuest is available at https://densequest.ielab.io.

* SIGIR 2024 demo paper
