Johan Irving Søltoft

Grounding Text Embeddings in Stakeholder Associations

May 26, 2026

Jonathan Rystrøm, Sofie Burgos-Thorsen, Zihao Fu, Johan Irving Søltoft, Kenneth C. Enevoldsen, Chris Russell

Abstract: Text embeddings are widely used to analyse large corpora of complex texts. However, it is unclear whether the embeddings capture the same semantic distances as the human experts using them. Ensuring alignment between embedding representations and human intentions is essential for valid analyses. We present the Stakeholder Grounding Exercise, a method for making expert associations explicit and grounding embedding model results in human understanding. In our primary case study on Danish policy issues, we find that neural text embeddings are substantially less reliable than human experts (19-26 pp gap), and that this misalignment propagates to downstream clustering performance (Spearman $\rho = 0.9$ between exercise ranking and cluster quality). A secondary study on US Federal AI use cases replicates the gap (16 pp) in English, using a digital protocol and a different community of experts, demonstrating that the gap is not an artefact of a single instrument or domain. The Stakeholder Grounding Exercise offers a practical method for assessing whether embedding models capture the semantic distinctions that matter most to domain experts.

Synthetic Interlocutors. Experiments with Generative AI to Prolong Ethnographic Encounters

Oct 15, 2024

Johan Irving Søltoft, Laura Kocksch, Anders Kristian Munk

Abstract: This paper introduces "Synthetic Interlocutors" for ethnographic research. Synthetic Interlocutors are chatbots that ingest ethnographic textual material (interviews and observations) using Retrieval Augmented Generation (RAG). We integrated an open-source large language model with ethnographic data from three projects to explore two questions: Can RAG digest ethnographic material and act as an ethnographic interlocutor? And, if so, can Synthetic Interlocutors prolong encounters with the field and extend our analysis? Through reflections on the process of building our Synthetic Interlocutors and an experimental collaborative workshop, we suggest that RAG can digest ethnographic materials, and that it might lead to prolonged yet uneasy ethnographic encounters that allowed us to partially recreate and revisit fieldwork interactions while facilitating opportunities for novel analytic insights. Synthetic Interlocutors can produce collaborative, ambiguous and serendipitous moments.
