Josef Kuchař

Unravelling the Mechanisms of Manipulating Numbers in Language Models

Oct 30, 2025

Michal Štefánik, Timothee Mickus, Marek Kadlčík, Bertram Højer, Michal Spiegel, Raúl Vázquez, Aman Sinha, Josef Kuchař, Philipp Mondorf

Abstract: Recent work has shown that different large language models (LLMs) converge to similar and accurate input embedding representations for numbers. These findings conflict with the documented propensity of LLMs to produce erroneous outputs when dealing with numeric information. In this work, we aim to explain this conflict by exploring how language models manipulate numbers and by quantifying the lower bounds of accuracy of these mechanisms. We find that, despite surfacing errors, different language models learn interchangeable representations of numbers that are systematic, highly accurate, and universal across their hidden states and the types of input contexts. This allows us to create universal probes for each LLM and to trace information, including the causes of output errors, to specific layers. Our results lay the groundwork for understanding how pre-trained LLMs manipulate numbers and outline the potential of more accurate probing techniques in targeted refinements of LLMs' architectures.

Pre-trained Language Models Learn Remarkably Accurate Representations of Numbers

Jun 10, 2025

Marek Kadlčík, Michal Štefánik, Timothee Mickus, Michal Spiegel, Josef Kuchař

Abstract: Pretrained language models (LMs) are prone to arithmetic errors. Existing work showed limited success in probing numeric values from models' representations, indicating that these errors can be attributed to the inherent unreliability of distributionally learned embeddings in representing exact quantities. However, we observe that previous probing methods are inadequate for the emergent structure of learned number embeddings with sinusoidal patterns. In response, we propose a novel probing technique that decodes numeric values from input embeddings with near-perfect accuracy across a range of open-source LMs. This proves that after pre-training alone, LMs represent numbers with remarkable precision. Finally, we find that the embeddings' precision, as judged by our probe's accuracy, explains a large portion of LMs' errors in elementary arithmetic, and show that aligning the embeddings with the pattern discovered by our probe can mitigate these errors.
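To make the idea of a sinusoidal number representation concrete, here is a toy sketch. It is not the probe proposed in the paper, and it does not use embeddings from a real model. It assumes a synthetic embedding whose coordinates are noisy sines and cosines of the number at a few hand-picked periods, and it decodes the number by matching against the clean sinusoidal basis. The periods, the noise level, and every function name are hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical periods, chosen for illustration only (not taken from any model).
PERIODS = (10.0, 100.0, 1000.0)
MAX_VALUE = 1000  # the toy decoder considers integers 0..999


def fourier_features(n):
    """Return sine/cosine features of the integer n at each period."""
    feats = []
    for period in PERIODS:
        angle = 2.0 * math.pi * n / period
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats


# Clean basis: one feature vector per candidate integer.
BASIS = [fourier_features(n) for n in range(MAX_VALUE)]


def synthetic_embedding(n, rng, noise=0.05):
    """Simulate a learned embedding: clean features plus Gaussian noise."""
    return [f + rng.gauss(0.0, noise) for f in fourier_features(n)]


def decode(embedding):
    """Return the candidate integer whose clean features best match the embedding.

    All basis vectors have the same norm, so the highest dot product
    is also the nearest neighbour in Euclidean distance.
    """
    best_n, best_score = 0, float("-inf")
    for n, basis_vec in enumerate(BASIS):
        score = sum(e * b for e, b in zip(embedding, basis_vec))
        if score > best_score:
            best_n, best_score = n, score
    return best_n


rng = random.Random(0)
values = list(range(0, MAX_VALUE, 7))
correct = sum(decode(synthetic_embedding(n, rng)) == n for n in values)
accuracy = correct / len(values)
print(f"toy probe accuracy on synthetic embeddings: {accuracy:.3f}")
```

In this toy setting a decoder that respects the periodic structure can recover exact values from only six coordinates. A plain linear regression onto the value cannot do the same here, because the value is not a linear function of its sine and cosine features. With a real model, the basis would have to be learned from the model's own number-token embeddings instead of being fixed by hand.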
