Robert P. Dick

SD-MoE: Spectral Decomposition for Effective Expert Specialization

Feb 13, 2026

Multi-Head Attention as a Source of Catastrophic Forgetting in MoE Transformers

Feb 13, 2026

A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language

Aug 22, 2024

Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models

May 13, 2024

How Capable Can a Transformer Become? A Study on Synthetic, Interpretable Tasks

Nov 21, 2023

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Nov 21, 2023

In-Context Learning Dynamics with Random Binary Sequences

Oct 26, 2023

Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task

Oct 13, 2023

Mechanistic Mode Connectivity

Nov 15, 2022

Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering

May 23, 2022