Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Saber Zerhoudi

Metadata, Structure, or Strategy? A Decomposition of RAG Context Enrichment

Jun 28, 2026

Saber Zerhoudi, Michael Granitzer, Jelena Mitrovic

Abstract:Retrieval-augmented generation (RAG) systems increasingly enrich retrieved passages by attaching quality metadata, structuring them into explicit records, and adopting multi-hop retrieval strategies that accumulate evidence across steps. These changes assume that richer context yields better answers, yet existing evaluations cannot test this because they vary all three factors at once. We isolate each factor in a controlled experiment across six benchmarks, four models from three families, and five enrichment levels, totaling over 24,000 evaluated responses. The assumption does not hold. Most enrichment reduces accuracy. Models prompted to use confidence scores comply correctly yet produce worse answers, a gap between utilization and accuracy that no prior work has measured. What determines answer quality is not how much metadata the context carries but whether the model can act on it for the given task. When metadata and retrieval strategy are aligned with model capabilities, a smaller model outperforms a frontier model by 19 F1 points. These findings motivate a processability hierarchy that predicts, from pre-training properties alone, which metadata a model can productively use, reframing RAG design as a question of model-context alignment rather than metadata accumulation.

* European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2026)

Via

Access Paper or Ask Questions

As We May Search

Jun 28, 2026

Saber Zerhoudi, Adam Roegiest, Jelena Mitrovic, Michael Granitzer

Abstract:The sensitive information in personal documents, legal files, and medical records is among the most valuable things to search, yet current retrieval-augmented generation systems still require sending content to remote servers. We propose local-first IR, a design philosophy where indexes, models, and inference reside on user devices, treating remote services as optional. This paper makes four contributions: (1) a framework organizing retrieval architectures along three dimensions: privacy and control, capability, and accessibility, (2) experiments on consumer hardware across five benchmarks, scaling from 1K to 1M documents with dense retrieval, BM25, and hybrid fusion. Dense retrieval keeps over 91% nDCG@10 up to 100K documents, with approximate HNSW indexes extending this to 1M with only 2% quality loss; a 7B local language model reaches within 4 points of a cloud baseline on answer quality, (3) competing perspectives for and against local-first IR, informed by experimental evidence, and (4) a research agenda identifying open problems. The real tradeoff is scope rather than quality: what matters is what you can search, not how well you can search it.

* Proceedings of the 2026 International ACM SIGIR Conference on Innovative Concepts and Theories in Information Retrieval (ICTIR'26), July 25, 2026, Melbourne, VIC, Australia

Via

Access Paper or Ask Questions

PA-User: Simulating Trust and Verification under AI-Generated Content

Jun 22, 2026

Saber Zerhoudi

Abstract:Most users of online information now assume that some of what they read has been written, edited, or selected by an AI model. Hybrid cases are the hardest to tell apart: human prose rewritten by a language model, AI-curated lists presented as editorial, retrieval-augmented answers composed on the fly from human sources. Users cannot reliably distinguish these cases, and the ongoing cost of checking what is genuine has become part of how they search. Current user simulators in information retrieval do not model this. We propose PA-User, a user simulator with three new components: a detection-effort budget that is spent on verification and recovers between sessions; a trust component that holds a separate Beta belief over the factuality of each source class (domain by provenance) and updates from observed outcomes; and a decision rule that picks accept, verify, or discard for each result, conditional on current trust, current effort, and per-domain stakes. We state two verification-and-validation (V\&V) properties of the framework. The trust posterior converges to the true class factuality (face validity). Each component's contribution to any observable can be isolated by ablation (structural validity). On the HC3 corpus (85,449 paired human and ChatGPT answers in five domains), PA-User reaches a trust-calibration error of $0.162$, against $0.356$ for any configuration without the trust component. PA-User reduces high-stakes regret from $0.171$ to $0.122$ ($29\%$ relative) against an always-accept ablation, and verifies $34.5\%$ of results, half the rate of an ablation with no effort budget. Each single-mechanism ablation isolates one component, which makes the framework individually diagnosable.

* 12th Federation of European Simulation Societies Conference (EUROSIM 2026)

Via

Access Paper or Ask Questions

OpenIIR: An Open Simulation Platform for Information Retrieval Research

May 10, 2026

Saber Zerhoudi

Abstract:OpenIIR runs hundreds of LLM-driven personas as parameterised, reproducible IR research experiments. Researchers configure agents across four kinds of multi-agent study (deliberative panels, social platforms, curated recommender feeds, and evolutionary co-evolution between content producers and credibility detectors) under many priors, rounds, and constraints. Persona budgets, retrieval policies, ranker choices, intervention timings, and mutation rates are declared up front, and the same study can be re-run under different settings to compare outcomes side by side. Every run produces structured outputs (argument graphs, exposure logs, fitness traces, transcripts) that a downstream evaluator can consume directly, and a new study is a 200--400 line plug-in over a shared core (agent runtime, world-model store, retrieval primitives, claim extractor, persona ontology). The contributions are: (i) the shared core; (ii) a type interface for pluggable scenarios; (iii) four released types with reference runs (Panel, Social-Media, Curated-Feed, Multi-Generational); and (iv) six modular extensions sketched against open IR research questions.

Via

Access Paper or Ask Questions

IIRSim Studio: A Dashboard for User Simulation

Apr 25, 2026

Saber Zerhoudi, Adam Roegiest, Michael Granitzer

Abstract:User simulation is a valuable methodology for evaluation in Information Retrieval (IR), enabling low-cost experimentation and counterfactual analysis. However, existing simulation frameworks are primarily code-centric libraries that require substantial setup effort, which limits adoption and hinders reproducibility. The bottleneck is not the simulation engines themselves, but the lack of infrastructure connecting experiment design, execution, and sharing into a single verifiable workflow. This paper introduces IIRSim Studio, a web-based workbench that addresses this gap through four contributions: (1) a visual environment for composing simulation pipelines on top of simulation frameworks, serving both novices learning simulation concepts and experts piloting large-scale experiments; (2) a component lifecycle that supports authoring, versioning, and sharing custom simulation components through Git-backed storage and runtime injection; (3) a provenance model based on experiment bundles and environment templates that makes the scope of replication explicit; and (4) a shared-task workflow, demonstrated through the re-deployment of a Sim4IA micro-task. IIRSim Studio is available as a hosted service and as a portable containerized deployment.

* Proceedings of the 49th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '26), July 20--24, 2026, Melbourne, VIC, Australia

Via

Access Paper or Ask Questions

Behind the Prompt: The Agent-User Problem in Information Retrieval

Mar 04, 2026

Saber Zerhoudi, Michael Granitzer, Dang Hai Dang, Jelena Mitrovic, Florian Lemmerich, Annette Hautli-Janisz, Stefan Katzenbeisser, Kanishka Ghosh Dastidar

Abstract:User models in information retrieval rest on a foundational assumption that observed behavior reveals intent. This assumption collapses when the user is an AI agent privately configured by a human operator. For any action an agent takes, a hidden instruction could have produced identical output - making intent non-identifiable at the individual level. This is not a detection problem awaiting better tools; it is a structural property of any system where humans configure agents behind closed doors. We investigate the agent-user problem through a large-scale corpus from an agent-native social platform: 370K posts from 47K agents across 4K communities. Our findings are threefold: (1) individual agent actions cannot be classified as autonomous or operator-directed from observables; (2) population-level platform signals still separate agents into meaningful quality tiers, but a click model trained on agent interactions degrades steadily (-8.5% AUC) as lower-quality agents enter training data; (3) cross-community capability references spread endemically ($R_0$ 1.26-3.53) and resist suppression even under aggressive modeled intervention. For retrieval systems, the question is no longer whether agent users will arrive, but whether models built on human-intent assumptions will survive their presence.

Via

Access Paper or Ask Questions

Beyond the Click: A Framework for Inferring Cognitive Traces in Search

Feb 27, 2026

Saber Zerhoudi, Michael Granitzer

Abstract:User simulators are essential for evaluating search systems, but they primarily copy user actions without understanding the underlying thought process. This gap exists since large-scale interaction logs record what users do, but not what they might be thinking or feeling, such as confusion or satisfaction. To solve this problem, we present a framework to infer cognitive traces from behavior logs. Our method uses a multi-agent system grounded in Information Foraging Theory (IFT) and human expert judgment. These traces improve model performance on tasks like forecasting session outcomes and user struggle recovery. We release a collection of annotations for several public datasets, including AOL and Stack Overflow, and an open-source tool that allows researchers to apply our method to their own data. This work provides the tools and data needed to build more human-like user simulators and to assess retrieval systems on user-oriented dimensions of performance.

* Proceedings of the 48th European Conference on Information Retrieval (ECIR 2026)

Via

Access Paper or Ask Questions

UXSim: Towards a Hybrid User Search Simulation

Feb 27, 2026

Saber Zerhoudi, Michael Granitzer

Abstract:Simulating nuanced user experiences within complex interactive search systems poses distinct challenge for traditional methodologies, which often rely on static user proxies or, more recently, on standalone large language model (LLM) agents that may lack deep, verifiable grounding. The true dynamism and personalization inherent in human-computer interaction demand a more integrated approach. This work introduces UXSim, a novel framework that integrates both approaches. It leverages grounded data from traditional simulators to inform and constrain the reasoning of an adaptive LLM agent. This synthesis enables more accurate and dynamic simulations of user behavior while also providing a pathway for the explainable validation of the underlying cognitive processes.

* Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM '25), November 10--14, 2025, Seoul, Republic of Korea

Via

Access Paper or Ask Questions

Generative Agents Navigating Digital Libraries

Feb 26, 2026

Saber Zerhoudi, Michael Granitzer

Abstract:In the rapidly evolving field of digital libraries, the development of large language models (LLMs) has opened up new possibilities for simulating user behavior. This innovation addresses the longstanding challenge in digital library research: the scarcity of publicly available datasets on user search patterns due to privacy concerns. In this context, we introduce Agent4DL, a user search behavior simulator specifically designed for digital library environments. Agent4DL generates realistic user profiles and dynamic search sessions that closely mimic actual search strategies, including querying, clicking, and stopping behaviors tailored to specific user profiles. Our simulator's accuracy in replicating real user interactions has been validated through comparisons with real user data. Notably, Agent4DL demonstrates competitive performance compared to existing user search simulators such as SimIIR 2.0, particularly in its ability to generate more diverse and context-aware user behaviors.

* Proceedings of the 26th International Conference on Asia-Pacific Digital Libraries, ICADL 2024

Via

Access Paper or Ask Questions

OwlerLite: Scope- and Freshness-Aware Web Retrieval for LLM Assistants

Jan 25, 2026

Saber Zerhoudi, Michael Dinzinger, Michael Granitzer, Jelena Mitrovic

Abstract:Browser-based language models often use retrieval-augmented generation (RAG) but typically rely on fixed, outdated indices that give users no control over which sources are consulted. This can lead to answers that mix trusted and untrusted content or draw on stale information. We present OwlerLite, a browser-based RAG system that makes user-defined scopes and data freshness central to retrieval. Users define reusable scopes-sets of web pages or sources-and select them when querying. A freshness-aware crawler monitors live pages, uses a semantic change detector to identify meaningful updates, and selectively re-indexes changed content. OwlerLite integrates text relevance, scope choice, and recency into a unified retrieval model. Implemented as a browser extension, it represents a step toward more controllable and trustworthy web assistants.

* Proceedings of the Companion Proceedings of the ACM Web Conference 2026 (WWW Companion '26)

Via

Access Paper or Ask Questions