Patrick Krauss

Surviving by Serving: Functional Relevance Drives Self-Organization in Complex Adaptive Systems

Jun 25, 2026

Claus Metzner, Ali Ghebleh, Achim Schilling, Andreas Maier, Thomas Kinfe, Patrick Krauss

Abstract:Complex adaptive systems often develop organized structures without centralized control. Yet the local mechanisms by which functional organization emerges and persists remain incompletely understood. Here we propose Surviving by Serving (SBS) as a general principle of self-organization: components persist as long as their outputs are utilized by other components, whereas prolonged non-utilization promotes adaptation and exploration. To investigate this idea, we introduce a minimal multi-agent model in which agents transform shared resources and receive only local feedback when their outputs are subsequently utilized elsewhere in the system. Despite the absence of global objectives, the system spontaneously self-organizes into functional interaction networks. We observe the emergence of stable transformation chains, core-periphery organization, and the generation of novel states that enable previously inaccessible target conditions to be reached. Remarkably, self-sustaining interaction networks can arise even without external selection pressures, creating a pre-adaptive search phase from which later functional solutions emerge. These findings suggest that functional utilization may provide a simple, substrate-independent mechanism for the emergence and stabilization of organized structure in complex adaptive systems.
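The feedback rule described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' model: the integer states, the single shared token pool, the `patience` threshold, and every name in the code are hypothetical choices made only for illustration. Each agent consumes one token in a source state and emits one in a target state; an agent's idle counter resets when another agent consumes its output, and an agent that stays unused for too long redraws its rule at random.

```python
import random


def simulate_sbs(n_agents=20, n_states=6, pool_size=30, patience=15,
                 steps=2000, seed=0):
    """Toy 'Surviving by Serving' loop (illustrative only, not the paper's model)."""
    rng = random.Random(seed)
    # Each agent is a rule: consume one token in state `src`, emit one in `dst`.
    rules = [(rng.randrange(n_states), rng.randrange(n_states))
             for _ in range(n_agents)]
    # Number of an agent's own activations since its output was last used.
    idle = [0] * n_agents
    rewires = 0
    # Shared pool of (state, producer) tokens; producer None marks initial supply.
    pool = [(rng.randrange(n_states), None) for _ in range(pool_size)]

    for _ in range(steps):
        a = rng.randrange(n_agents)
        src, dst = rules[a]
        matches = [i for i, (state, _) in enumerate(pool) if state == src]
        if matches:
            _, producer = pool.pop(rng.choice(matches))
            if producer is not None and producer != a:
                idle[producer] = 0  # local feedback: the producer's output was used
            pool.append((dst, a))   # pool size stays constant
        idle[a] += 1
        if idle[a] > patience:
            # Prolonged non-utilization: explore by drawing a new random rule.
            rules[a] = (rng.randrange(n_states), rng.randrange(n_states))
            idle[a] = 0
            rewires += 1
    return rules, pool, rewires
```

Calling `simulate_sbs()` returns the final rules, the token pool, and the number of rule changes. There is no global objective anywhere in the loop; the only signal an agent receives is whether its output was consumed.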

A Differentiable Atari VCS: A Complex, Fully Known Ground Truth for Explainable AI

Jun 21, 2026

Andreas Maier, Siming Bayer, Patrick Krauss

Abstract:Explanation requires ground truth: to verify an account of a system we must know its inner functioning, which is just what is missing where explainable AI (XAI) is most needed. Systems we can study fall into two camps. Simple, procedural ones (decision trees, rule lists, sparse linear models) have a known but trivial mechanism, so explaining them tests nothing; genuinely complex ones (deep networks, real-world tasks) need XAI but have no ground-truth inner functioning, so an explanation can be plausible, confident, and wrong with no way to tell. We remove this dichotomy with a study object that is both genuinely complex and fully specified (inspectable by construction) and, so that gradient methods apply, fully differentiable. We reimplement the Atari 2600 Video Computer System (VCS), a real computer architecture and the cradle of deep reinforcement learning, as two independent end-to-end differentiable emulators in Julia (jutari) and JAX (jaxtari), each validated bit-for-bit against xitari. Both reproduce xitari on all 64 supported Arcade Learning Environment (ALE) games: 64/64 byte-identical RAM and 64/64 pixel-identical screens. Treating the cartridge ROM as a weight tensor, RAM as a soft tape, and control flow as gates, we prove the differentiable (soft) execution equals the original (hard) one bit-for-bit in the forward pass at any finite temperature, while exposing surrogate gradients where the bit logic has none. The JAX port also opens a GPU path: batched differentiable rollouts reach millions of environment-steps/s on one commodity GPU. The system was built in roughly 137 active hours over 29 calendar days, much of it written autonomously by coding agents. This paper builds and validates the foundation, showing, both theoretically and in a qualitative gradient study, that gradient-based XAI on it is feasible. Both ports' full code is available under the MIT license at https://github.com/akmaier/UnderstandingVCS.

* Submission for AAAI 2027

Convergent Representations of Linguistic Constructions in Human and Artificial Neural Systems

Mar 31, 2026

Pegah Ramezani, Thomas Kinfe, Andreas Maier, Achim Schilling, Patrick Krauss

Abstract:Understanding how the brain processes linguistic constructions is a central challenge in cognitive neuroscience and linguistics. Recent computational studies show that artificial neural language models spontaneously develop differentiated representations of Argument Structure Constructions (ASCs), generating predictions about when and how construction-level information emerges during processing. The present study tests these predictions in human neural activity using electroencephalography (EEG). Ten native English speakers listened to 200 synthetically generated sentences across four construction types (transitive, ditransitive, caused-motion, resultative) while neural responses were recorded. Analyses using time-frequency methods, feature extraction, and machine learning classification revealed construction-specific neural signatures emerging primarily at sentence-final positions, where argument structure becomes fully disambiguated, and most prominently in the alpha band. Pairwise classification showed reliable differentiation, especially between ditransitive and resultative constructions, while other pairs overlapped. Crucially, the temporal emergence and similarity structure of these effects mirror patterns in recurrent and transformer-based language models, where constructional representations arise during integrative processing stages. These findings support the view that linguistic constructions are neurally encoded as distinct form-meaning mappings, in line with Construction Grammar, and suggest convergence between biological and artificial systems on similar representational solutions. More broadly, this convergence is consistent with the idea that learning systems discover stable regions within an underlying representational landscape - recently termed a Platonic representational space - that constrains the emergence of efficient linguistic abstractions.

Probing Internal Representations of Multi-Word Verbs in Large Language Models

Feb 07, 2025

Hassane Kissane, Achim Schilling, Patrick Krauss

Abstract:This study investigates the internal representations of verb-particle combinations, called multi-word verbs, within transformer-based large language models (LLMs), specifically examining how these models capture lexical and syntactic properties at different neural network layers. Using the BERT architecture, we analyze the representations of its layers for two different verb-particle constructions: phrasal verbs like 'give up' and prepositional verbs like 'look at'. Our methodology includes training probing classifiers on the internal representations to classify these categories at both word and sentence levels. The results indicate that the model's middle layers achieve the highest classification accuracies. To further analyze the nature of these distinctions, we conduct a data separability test using the Generalized Discrimination Value (GDV). While GDV results show weak linear separability between the two verb types, probing classifiers still achieve high accuracy, suggesting that representations of these linguistic categories may be non-linearly separable. This aligns with previous research indicating that linguistic distinctions in neural networks are not always encoded in a linearly separable manner. These findings computationally support usage-based claims on the representation of verb-particle constructions and highlight the complex interaction between neural network architectures and linguistic structures.
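The separability test mentioned in the abstract can be sketched as follows. The abstract does not spell out the GDV formula, so the definition used here (z-score each dimension, scale by 0.5, then take the mean intra-class distance minus the mean inter-class distance, divided by the square root of the dimensionality) is an assumption based on the commonly cited formulation and should be checked against the original GDV publication. The function name and input layout are hypothetical.

```python
import math
from itertools import combinations, product


def gdv(points, labels):
    """Generalized Discrimination Value sketch (assumed definition, see lead-in).

    Requires at least two classes with at least two points each.
    More negative values indicate better class separation.
    """
    n, dim = len(points), len(points[0])
    # z-score every dimension, then scale by 0.5 (assumed normalization).
    cols = list(zip(*points))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[0.5 * (v - m) / s for v, m, s in zip(p, means, stds)]
              for p in points]

    # Group the scaled points by class label.
    classes = {}
    for p, lab in zip(scaled, labels):
        classes.setdefault(lab, []).append(p)
    groups = list(classes.values())

    def mean_dist(pairs):
        dists = [math.dist(a, b) for a, b in pairs]
        return sum(dists) / len(dists)

    # Mean pairwise distance within each class and between each pair of classes.
    intra = [mean_dist(combinations(g, 2)) for g in groups]
    inter = [mean_dist(product(g, h)) for g, h in combinations(groups, 2)]
    return (sum(intra) / len(intra) - sum(inter) / len(inter)) / math.sqrt(dim)
```

Under this definition, two tight and well-separated clusters give a value close to -1, while classes that overlap give a value near or above zero.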

Author-Specific Linguistic Patterns Unveiled: A Deep Learning Study on Word Class Distributions

Jan 17, 2025

Patrick Krauss, Achim Schilling

Abstract:Deep learning methods have been increasingly applied to computational linguistics to uncover patterns in text data. This study investigates author-specific word class distributions using part-of-speech (POS) tagging and bigram analysis. By leveraging deep neural networks, we classify literary authors based on POS tag vectors and bigram frequency matrices derived from their works. We employ fully connected and convolutional neural network architectures to explore the efficacy of unigram and bigram-based representations. Our results demonstrate that while unigram features achieve moderate classification accuracy, bigram-based models significantly improve performance, suggesting that sequential word class patterns are more distinctive of authorial style. Multi-dimensional scaling (MDS) visualizations reveal meaningful clustering of authors' works, supporting the hypothesis that stylistic nuances can be captured through computational methods. These findings highlight the potential of deep learning and linguistic feature analysis for author profiling and literary studies.
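The two feature types named in the abstract, POS tag vectors and bigram frequency matrices, can be sketched as below. This is only an illustration under stated assumptions: the input is taken to be an already tagged text, the three-tag tagset in the usage note is a made-up example, and normalizing counts to relative frequencies is an assumed choice the abstract does not confirm.

```python
def pos_features(tags, tagset):
    """Return (unigram vector, bigram matrix) of relative POS tag frequencies.

    `tags` is the tag sequence of one text; `tagset` fixes the feature order.
    Both the function name and the normalization are illustrative assumptions.
    """
    index = {tag: i for i, tag in enumerate(tagset)}
    k = len(tagset)
    unigram = [0.0] * k
    bigram = [[0.0] * k for _ in range(k)]

    for tag in tags:
        unigram[index[tag]] += 1
    # Count each adjacent tag pair (row: first tag, column: following tag).
    for first, second in zip(tags, tags[1:]):
        bigram[index[first]][index[second]] += 1

    # Normalize counts so that texts of different length are comparable.
    n_uni = max(len(tags), 1)
    n_bi = max(len(tags) - 1, 1)
    unigram = [c / n_uni for c in unigram]
    bigram = [[c / n_bi for c in row] for row in bigram]
    return unigram, bigram
```

For example, `pos_features(["DET", "NOUN", "VERB", "DET", "NOUN"], ["DET", "NOUN", "VERB"])` yields a unigram vector with three entries and a 3x3 bigram matrix. The matrix keeps the order of adjacent tags, which is the sequential information the abstract reports as more distinctive of authorial style.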

Refusal Behavior in Large Language Models: A Nonlinear Perspective

Jan 14, 2025

Fabian Hildebrandt, Andreas Maier, Patrick Krauss, Achim Schilling

Abstract:Refusal behavior in large language models (LLMs) enables them to decline responding to harmful, unethical, or inappropriate prompts, ensuring alignment with ethical standards. This paper investigates refusal behavior across six LLMs from three architectural families. We challenge the assumption of refusal as a linear phenomenon by employing dimensionality reduction techniques, including PCA, t-SNE, and UMAP. Our results reveal that refusal mechanisms exhibit nonlinear, multidimensional characteristics that vary by model architecture and layer. These findings highlight the need for nonlinear interpretability to improve alignment research and inform safer AI deployment strategies.

Exploring Narrative Clustering in Large Language Models: A Layerwise Analysis of BERT

Jan 14, 2025

Awritrojit Banerjee, Achim Schilling, Patrick Krauss

Abstract:This study investigates the internal mechanisms of BERT, a transformer-based large language model, with a focus on its ability to cluster narrative content and authorial style across its layers. Using a dataset of narratives developed via GPT-4, featuring diverse semantic content and stylistic variations, we analyze BERT's layerwise activations to uncover patterns of localized neural processing. Through dimensionality reduction techniques such as Principal Component Analysis (PCA) and Multidimensional Scaling (MDS), we reveal that BERT exhibits strong clustering based on narrative content in its later layers, with progressively compact and distinct clusters. While strong stylistic clustering might occur when narratives are rephrased into different text types (e.g., fables, sci-fi, kids' stories), minimal clustering is observed for authorial style specific to individual writers. These findings highlight BERT's prioritization of semantic content over stylistic features, offering insights into its representational capabilities and processing hierarchy. This study contributes to understanding how transformer models like BERT encode linguistic information, paving the way for future interdisciplinary research in artificial intelligence and cognitive neuroscience.

* arXiv admin note: text overlap with arXiv:2408.03062, arXiv:2408.04270, arXiv:2307.01577

Analysis and Visualization of Linguistic Structures in Large Language Models: Neural Representations of Verb-Particle Constructions in BERT

Dec 19, 2024

Hassane Kissane, Achim Schilling, Patrick Krauss

Abstract:This study investigates the internal representations of verb-particle combinations within transformer-based large language models (LLMs), specifically examining how these models capture lexical and syntactic nuances at different neural network layers. Employing the BERT architecture, we analyse the representational efficacy of its layers for various verb-particle constructions such as 'agree on', 'come back', and 'give up'. Our methodology includes a detailed dataset preparation from the British National Corpus, followed by extensive model training and output analysis through techniques like multi-dimensional scaling (MDS) and generalized discrimination value (GDV) calculations. Results show that BERT's middle layers most effectively capture syntactic structures, with significant variability in representational accuracy across different verb categories. These findings challenge the conventional uniformity assumed in neural network processing of linguistic elements and suggest a complex interplay between network architecture and linguistic representation. Our research contributes to a better understanding of how deep learning models comprehend and process language, offering insights into the potential and limitations of current neural approaches to linguistic analysis. This study not only advances our knowledge in computational linguistics but also prompts further research into optimizing neural architectures for enhanced linguistic precision.

Probing for Consciousness in Machines

Nov 25, 2024

Mathis Immertreu, Achim Schilling, Andreas Maier, Patrick Krauss

Abstract:This study explores the potential for artificial agents to develop core consciousness, as proposed by Antonio Damasio's theory of consciousness. According to Damasio, the emergence of core consciousness relies on the integration of a self model, informed by representations of emotions and feelings, and a world model. We hypothesize that an artificial agent, trained via reinforcement learning (RL) in a virtual environment, can develop preliminary forms of these models as a byproduct of its primary task. The agent's main objective is to learn to play a video game and explore the environment. To evaluate the emergence of world and self models, we employ probes, that is, feedforward classifiers that use the activations of the trained agent's neural networks to predict the spatial positions of the agent itself. Our results demonstrate that the agent can form rudimentary world and self models, suggesting a pathway toward developing machine consciousness. This research provides foundational insights into the capabilities of artificial agents in mirroring aspects of human consciousness, with implications for future advancements in artificial intelligence.

Nonlinear Neural Dynamics and Classification Accuracy in Reservoir Computing

Nov 15, 2024

Claus Metzner, Achim Schilling, Andreas Maier, Patrick Krauss

Abstract:Reservoir computing - information processing based on untrained recurrent neural networks with random connections - is expected to depend on the nonlinear properties of the neurons and the resulting oscillatory, chaotic, or fixpoint dynamics of the network. However, the required degree of nonlinearity and the range of suitable dynamical regimes for a given task are not fully understood. To clarify these questions, we study the accuracy of a reservoir computer in artificial classification tasks of varying complexity, while tuning the neuron's degree of nonlinearity and the reservoir's dynamical regime. We find that, even for activation functions with extremely reduced nonlinearity, weak recurrent interactions and small input signals, the reservoir is able to compute useful representations, detectable only in higher order principal components, that render complex classification tasks linearly separable for the readout layer. When increasing the recurrent coupling, the reservoir develops spontaneous dynamical behavior. Nevertheless, the input-related computations can 'ride on top' of oscillatory or fixpoint attractors without much loss of accuracy, whereas chaotic dynamics reduces task performance more drastically. By tuning the system through the full range of dynamical phases, we find that the accuracy peaks both at the oscillatory/chaotic and at the chaotic/fixpoint phase boundaries, thus supporting the 'edge of chaos' hypothesis. Our results, in particular the robust weakly nonlinear operating regime, may offer new perspectives both for technical and biological neural networks with random connectivity.
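The untrained recurrent network at the heart of a reservoir computer can be sketched in a few lines. This is a generic illustration, not the authors' setup: the `tanh` activation, the Gaussian weights with 1/sqrt(N) scaling, the scalar input, and all parameter names are assumptions, and the trained linear readout layer is omitted. The `coupling` parameter plays the role of the recurrent coupling strength that the abstract describes tuning.

```python
import math
import random


def run_reservoir(inputs, n_neurons=50, coupling=0.5, input_scale=0.1, seed=0):
    """Drive a fixed random recurrent network and return its state at each step."""
    rng = random.Random(seed)
    # Fixed random recurrent weights; they are never trained.
    w = [[rng.gauss(0.0, 1.0) / math.sqrt(n_neurons) for _ in range(n_neurons)]
         for _ in range(n_neurons)]
    # Fixed random input weights for a scalar input signal.
    w_in = [rng.gauss(0.0, 1.0) * input_scale for _ in range(n_neurons)]

    x = [0.0] * n_neurons
    states = []
    for u in inputs:
        # New state: nonlinearity applied to recurrent drive plus input drive.
        x = [math.tanh(coupling * sum(wij * xj for wij, xj in zip(row, x))
                       + wi * u)
             for row, wi in zip(w, w_in)]
        states.append(x)
    return states
```

In a full reservoir computer, a linear readout would be fitted on the returned states. With `coupling=0.0` each state depends only on the current input, and raising `coupling` adds memory of earlier inputs through the recurrent term.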
