
Razvan Pascanu

Google DeepMind

What Can Grokking Teach Us About Learning Under Nonstationarity?

Jul 26, 2025

Optimizers Qualitatively Alter Solutions And We Should Leverage This

Jul 16, 2025

Meta-learning how to Share Credit among Macro-Actions

Jun 16, 2025

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

Jun 05, 2025

Plasticity as the Mirror of Empowerment

May 15, 2025

On the generalization of language models from in-context learning and finetuning: a controlled study

May 01, 2025

LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

Apr 22, 2025

Hadamard product in deep learning: Introduction, Advances and Challenges

Apr 17, 2025

Why do LLMs attend to the first token?

Apr 03, 2025

NoProp: Training Neural Networks without Back-propagation or Forward-propagation

Mar 31, 2025