Abstract: The evolution of intelligence in artificial systems provides a unique opportunity to identify universal computational principles. Here we show that large language models spontaneously develop synergistic cores, where information integration exceeds that of the individual parts, remarkably similar to the human brain. Using Integrated Information Decomposition across multiple architectures, we find that middle layers exhibit synergistic processing while early and late layers rely on redundancy. This organization is dynamic and emerges as a physical phase transition as task difficulty increases. Crucially, ablating synergistic components causes catastrophic performance loss, confirming their role as the physical substrate of abstract reasoning and bridging artificial and biological intelligence.
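To make concrete what "information integration exceeding the individual parts" means, here is a minimal sketch of a static partial information decomposition (PID) using the minimal-mutual-information redundancy measure, applied to a toy XOR system. The abstract's actual method, Integrated Information Decomposition, extends this decomposition to temporal dynamics; the code below is our own illustration, not the authors' pipeline, and all function names are assumptions.

```python
import numpy as np
from itertools import product

def mutual_info(joint):
    """I(A;B) in bits from a 2-D joint probability table."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())

# Joint distribution p(x1, x2, y) for Y = X1 XOR X2 with uniform input bits.
p = np.zeros((2, 2, 2))
for x1, x2 in product(range(2), range(2)):
    p[x1, x2, x1 ^ x2] = 0.25

i1 = mutual_info(p.sum(axis=1))      # I(X1; Y), marginalizing out X2
i2 = mutual_info(p.sum(axis=0))      # I(X2; Y), marginalizing out X1
i12 = mutual_info(p.reshape(4, 2))   # I(X1, X2; Y), joint sources vs. target

# MMI redundancy, and synergy as the excess of the joint over the parts.
red = min(i1, i2)
syn = i12 - i1 - i2 + red
print(f"I(X1;Y)={i1:.2f}  I(X2;Y)={i2:.2f}  I(X1,X2;Y)={i12:.2f}")
print(f"redundancy={red:.2f} bits, synergy={syn:.2f} bits")
```

For XOR, each input carries zero information about the output individually, yet the pair determines it completely, so the decomposition attributes the full bit to synergy; this is the sense in which a "synergistic core" integrates more information than its parts.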
Abstract: Grokking in modular arithmetic has established itself as the quintessential "fruit fly" experiment, serving as a critical domain for investigating the mechanistic origins of model generalization. Despite its significance, existing research remains narrowly focused on specific local circuits or optimization tuning, largely overlooking the global structural evolution that fundamentally drives the phenomenon. We propose that grokking originates from a spontaneous simplification of internal model structure governed by the principle of parsimony. Integrating causal, spectral, and algorithmic complexity measures with Singular Learning Theory, we reveal that the transition from memorization to generalization corresponds to the physical collapse of redundant manifolds and deep information compression, offering a novel perspective on the mechanisms of model overfitting and generalization.
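For readers unfamiliar with the testbed, below is a minimal sketch of the standard modular-arithmetic grokking setup the abstract refers to: a small network trained with strong weight decay on (a + b) mod p, where test accuracy typically jumps long after the training set is memorized. The architecture, modulus, and hyperparameters are illustrative assumptions (the grokking literature usually uses small transformers), and none of the paper's complexity or Singular Learning Theory measurements are reproduced here.

```python
import torch
import torch.nn as nn

P = 97  # modulus; grokking is commonly reported for small primes such as 97 or 113
torch.manual_seed(0)

# All pairs (a, b) labeled with (a + b) mod P, split ~50/50 into train/test.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
split = len(pairs) // 2
tr, te = perm[:split], perm[split:]

# An embedding + MLP stand-in; the memorize-then-generalize gap has been
# replicated for MLPs as well as the 1-layer transformers of the original work.
class Net(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.emb = nn.Embedding(P, d)
        self.mlp = nn.Sequential(nn.Linear(2 * d, 512), nn.ReLU(), nn.Linear(512, P))

    def forward(self, x):
        return self.mlp(self.emb(x).flatten(1))  # (N, 2, d) -> (N, 2d) -> (N, P)

model = Net()
# Strong weight decay is the usual lever that elicits delayed generalization.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(20000):  # grokking can require tens of thousands of steps
    model.train()
    opt.zero_grad()
    loss = loss_fn(model(pairs[tr]), labels[tr])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        model.eval()
        with torch.no_grad():
            acc = lambda idx: (model(pairs[idx]).argmax(1) == labels[idx]).float().mean()
            print(f"step {step:6d}  train acc {acc(tr):.3f}  test acc {acc(te):.3f}")
```

Train accuracy saturates early while test accuracy stays near chance for many steps before rising sharply; the abstract's claim is that this delayed transition coincides with a measurable structural simplification of the network, not merely a quirk of the optimizer.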