Johannes von Oswald

Institute of Neuroinformatics, ETH Zürich and University of Zürich, Zürich, Switzerland

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Dec 24, 2025

Do Depth-Grown Models Overcome the Curse of Depth? An In-Depth Analysis

Dec 09, 2025

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

Jun 05, 2025

Multi-agent cooperation through learning-aware policy gradients

Oct 24, 2024

Learning Randomized Algorithms with Transformers

Aug 20, 2024

When can transformers compositionally generalize in-context?

Jul 17, 2024

State Soup: In-Context Skill Learning, Retrieval and Mixing

Jun 12, 2024

Linear Transformers are Versatile In-Context Learners

Feb 21, 2024

Discovering modular solutions that generalize compositionally

Dec 22, 2023

Uncovering mesa-optimization algorithms in Transformers

Sep 11, 2023