François Lancelot

ATLANTIS at SemEval-2025 Task 3: Detecting Hallucinated Text Spans in Question Answering

Aug 07, 2025

Catherine Kobus, François Lancelot, Marion-Cécile Martin, Nawal Ould Amer

Abstract: This paper presents the contributions of the ATLANTIS team to SemEval-2025 Task 3, which focuses on detecting hallucinated text spans in question answering systems. Large Language Models (LLMs) have significantly advanced Natural Language Generation (NLG) but remain susceptible to hallucinations, generating incorrect or misleading content. To address this, we explored methods both with and without external context, using few-shot prompting with an LLM, token-level classification, or an LLM fine-tuned on synthetic data. Notably, our approaches achieved top rankings in Spanish and competitive placements in English and German. This work highlights the importance of integrating relevant context to mitigate hallucinations and demonstrates the potential of fine-tuned models and prompt engineering.
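To illustrate how token-level classification can yield hallucinated text spans, here is a minimal sketch. It is not the ATLANTIS team's implementation: the function name `tokens_to_spans`, the 0.5 threshold, and the example offsets and probabilities are all assumptions made for illustration. It assumes a classifier has already produced one hallucination probability per token, along with each token's character offsets in the answer.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def tokens_to_spans(
    offsets: List[Span], probs: List[float], threshold: float = 0.5
) -> List[Span]:
    """Merge consecutive flagged tokens into character spans.

    Hypothetical helper: a token is flagged when its probability is at
    least `threshold`; runs of consecutive flagged tokens become one span.
    """
    spans: List[Span] = []
    prev_flagged = False
    for (start, end), p in zip(offsets, probs):
        flagged = p >= threshold
        if flagged and prev_flagged:
            # Extend the current span to cover this token.
            spans[-1] = (spans[-1][0], end)
        elif flagged:
            # Start a new span at this token.
            spans.append((start, end))
        prev_flagged = flagged
    return spans


# Illustrative input only (not data from the paper).
answer = "Paris is in Spain"
offsets = [(0, 5), (6, 8), (9, 11), (12, 17)]
probs = [0.1, 0.2, 0.6, 0.9]
spans = tokens_to_spans(offsets, probs)
flagged_text = [answer[s:e] for s, e in spans]
```

Merging by token adjacency rather than by character distance keeps two flagged tokens separate whenever an unflagged token sits between them.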

A question-answering system for aircraft pilots' documentation

Nov 26, 2020

Alexandre Arnold, Gérard Dupont, Félix Furger, Catherine Kobus, François Lancelot

Abstract: The aerospace industry relies on massive collections of complex and technical documents covering system descriptions, manuals, and procedures. This paper presents a question answering (QA) system that would help aircraft pilots access information in this documentation by interacting with the system and asking questions in natural language. After describing each module of the dialog system, we present a multi-task approach for the QA module that improves performance on a Flight Crew Operating Manual (FCOM) dataset. A method to combine scores from the retriever and the QA modules is also presented.

* 11 pages, 8 figures
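The abstract above mentions combining scores from the retriever and the QA modules but does not say how. One common way to do this is a weighted sum, sketched below. This is not the paper's method: the `Candidate` class, the `alpha` weight, and the assumption that both scores are already normalized to [0, 1] are all illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    answer: str
    retriever_score: float  # assumed normalized to [0, 1]
    reader_score: float     # QA module score, assumed normalized to [0, 1]


def combined_score(c: Candidate, alpha: float = 0.5) -> float:
    """Linear interpolation: alpha weights the retriever, 1 - alpha the QA module."""
    return alpha * c.retriever_score + (1.0 - alpha) * c.reader_score


def best_answer(candidates: List[Candidate], alpha: float = 0.5) -> Candidate:
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: combined_score(c, alpha))


# Illustrative candidates only (not results from the paper).
candidates = [
    Candidate("answer from passage A", retriever_score=0.9, reader_score=0.2),
    Candidate("answer from passage B", retriever_score=0.6, reader_score=0.8),
]
top = best_answer(candidates, alpha=0.5)
```

In this sketch, `alpha` would typically be tuned on held-out data. Setting it to 1.0 ranks by retriever score alone, and setting it to 0.0 ranks by QA score alone.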