Freja Thoresen

A multilingual hallucination benchmark: MultiWikiQHalluA

May 04, 2026

Freja Thoresen, Dan Saattrup Smart

Abstract: Most hallucination evaluations focus on English, leaving it unclear whether findings transfer to lower-resource languages. We investigate faithfulness hallucinations, defined as model-generated content that is fluent and plausible but diverges from the provided input or is internally inconsistent. Leveraging the multilingual MultiWikiQA dataset, we use the LettuceDetect framework to create synthetic hallucination datasets for 306 languages, from which we train token-level hallucination classifiers for 30 European languages. In this work, we present evaluations of model hallucinations on a selection of languages: English, Danish, German, and Icelandic. Using these classifiers, we evaluate the hallucination rates for Qwen3-0.6B, Qwen3-14B, Gemma-3-12B-IT, cogito-v1-preview-qwen-32B, and cogito-v1-preview-llama-70B. Our classifiers reveal notably higher hallucination rates for Qwen3-0.6B (up to 60% of answers containing at least one hallucination, peaking in Icelandic) and generally lower rates for larger models, with cogito-v1-preview-qwen-32B and cogito-v1-preview-llama-70B performing best on most languages. Hallucination rates are consistently higher for lower-resource languages, particularly Icelandic.
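
The abstract reports hallucination rates as the share of answers containing at least one hallucination. A minimal sketch of how such an answer-level rate could be derived from token-level classifier output is given below. The function name, the 0/1 label encoding, and the toy labels are assumptions made for illustration; they are not taken from the paper or from the LettuceDetect API.

```python
from typing import List


def answer_hallucination_rate(token_labels: List[List[int]]) -> float:
    """Return the fraction of answers with at least one flagged token.

    Each inner list holds one answer's token labels, where 1 marks a token
    the classifier flagged as hallucinated and 0 marks a supported token
    (an assumed encoding).
    """
    if not token_labels:
        raise ValueError("need at least one answer")
    flagged_answers = sum(1 for labels in token_labels if any(labels))
    return flagged_answers / len(token_labels)


# Toy labels for five answers; two of them contain a flagged token.
example_labels = [[0, 0, 0], [0, 1, 1, 0], [0, 0], [1], [0, 0, 0, 0]]
rate = answer_hallucination_rate(example_labels)
print(f"{rate:.0%} of answers contain at least one hallucination")
```

Counting an answer once however many tokens are flagged keeps the metric comparable across languages whose tokenizations differ in length.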

* Camera-ready version for RESOURCEFUL 2026

Insights into Lunar Mineralogy: An Unsupervised Approach for Clustering of the Moon Mineral Mapper (M3) spectral data

Nov 05, 2024

Freja Thoresen, Igor Drozdovskiy, Aidan Cowley, Magdelena Laban, Sebastien Besse, Sylvain Blunier

Abstract: This paper presents a novel method for mapping spectral features of the Moon using machine learning-based clustering of hyperspectral data from the Moon Mineral Mapper (M3) imaging spectrometer. The method uses a convolutional variational autoencoder to reduce the dimensionality of the spectral data and extract features of the spectra. A k-means algorithm is then applied to cluster the latent variables into five distinct groups, corresponding to dominant spectral features, which are related to the mineral composition of the Moon's surface. The resulting global spectral cluster map shows the distribution of the five clusters on the Moon, which consist of a mixture of, among others, plagioclase, pyroxene, olivine, and Fe-bearing minerals across the Moon's surface. The clusters are compared to the mineral maps from the Kaguya mission; the comparison shows that the locations of the clusters overlap with the locations of high wt% of minerals such as plagioclase, clinopyroxene, and olivine. The paper demonstrates the usefulness of unbiased unsupervised learning for lunar mineral exploration and provides a comprehensive analysis of lunar mineralogy.
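
The clustering step described in the abstract (k-means applied to autoencoder latent variables) can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: the latent vectors are made-up 2-D toy points rather than encoder output, the autoencoder itself is not shown, and centroids are initialised from the first k points for determinism, which is a simplification rather than the paper's setup.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Sequence[float]], k: int,
           iterations: int = 50) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm with first-k initialisation (assumed, for determinism)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: label each point with its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # labels are stable, so the centroids are too
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return assignments, centroids


# Hypothetical latent vectors forming two well-separated groups.
latents = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
           (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
labels, centers = kmeans(latents, k=2)
print(labels)
```

With real latent variables one would pass k=5 to match the five clusters reported in the abstract, and a randomised or k-means++ style initialisation would usually be preferred.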

Breccia and basalt classification of thin sections of Apollo rocks with deep learning

Oct 28, 2024

Freja Thoresen, Aidan Cowley, Romeo Haak, Jonas Lewe, Clara Moriceau, Piotr Knapczyk, Victoria S. Engelschiøn

Abstract: Human exploration of the Moon is expected to resume in the next decade, following the last such activities in the Apollo programme era. One of the major objectives of returning to the Moon is to continue retrieving geological samples, with a focus on collecting high-quality specimens to maximize scientific return. Tools that assist astronauts in making informed decisions about sample collection activities can maximize the scientific value of future lunar missions. A lunar rock classifier is a tool that can potentially provide the necessary information for astronauts to analyze lunar rock samples, allowing them to augment in-situ value identification of samples. Towards demonstrating the value of such a tool, in this paper, we introduce a framework for classifying rock types in thin sections of lunar rocks. We leverage the vast collection of petrographic thin-section images from the Apollo missions, captured under plane-polarized light (PPL), cross-polarized light (XPL), and reflected light at varying magnifications. Advanced machine learning methods, including contrastive learning, are applied to analyze these images and extract meaningful features. The contrastive learning approach fine-tunes a pre-trained Inception-ResNet-v2 network with the SimCLR loss function. The fine-tuned Inception-ResNet-v2 network can then extract essential features effectively from the thin-section images of Apollo rocks. A simple binary classifier is trained using transfer learning from the fine-tuned Inception-ResNet-v2, reaching 98.44% (±1.47) accuracy in separating breccias from basalts.
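
The SimCLR loss mentioned in the abstract is the normalised temperature-scaled cross-entropy (NT-Xent) loss over pairs of augmented views. The sketch below computes it in pure Python for small embedding lists. The temperature value, the function names, and the toy embeddings are assumptions for illustration; the network, augmentations, and training loop used in the paper are not shown.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumes non-zero vectors; a zero vector would divide by zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nt_xent_loss(view_a: List[Sequence[float]],
                 view_b: List[Sequence[float]],
                 temperature: float = 0.5) -> float:
    """NT-Xent loss; view_a[i] and view_b[i] embed two views of image i."""
    n = len(view_a)
    if n == 0 or n != len(view_b):
        raise ValueError("views must be non-empty and of equal length")
    z = list(view_a) + list(view_b)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same image
        # Similarities to every other embedding, excluding the anchor itself.
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        positive_logit = cosine(z[i], z[positive]) / temperature
        total += math.log(sum(math.exp(l) for l in logits)) - positive_logit
    return total / (2 * n)


# Toy embeddings: matching views point in nearly the same direction.
first_views = [(1.0, 0.0), (0.0, 1.0)]
second_views = [(1.0, 0.1), (0.1, 1.0)]
aligned_loss = nt_xent_loss(first_views, second_views)
shuffled_loss = nt_xent_loss(first_views, second_views[::-1])
print(aligned_loss < shuffled_loss)
```

The loss drops when each embedding is closest to the other view of the same image, which is the signal that contrastive fine-tuning exploits before a classifier is trained on the extracted features.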
