Thomas L. Griffiths

Large Language Models Develop Novel Social Biases Through Adaptive Exploration

Nov 08, 2025
Via arXiv

Demystifying the Mechanisms Behind Emergent Exploration in Goal-conditioned RL

Oct 15, 2025
Via arXiv

Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models

Aug 19, 2025
Via arXiv

Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models

Jul 10, 2025
Via arXiv

VideoGameBench: Can Vision-Language Models complete popular video games?

May 23, 2025
Via arXiv

Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems

May 23, 2025
Via arXiv

Partner Modelling Emerges in Recurrent Agents (But Only When It Matters)

May 22, 2025
Via arXiv

Steering Risk Preferences in Large Language Models by Aligning Behavioral and Neural Representations

May 16, 2025
Via arXiv

Using Reinforcement Learning to Train Large Language Models to Explain Human Decisions

May 16, 2025
Via arXiv

Predictability Shapes Adaptation: An Evolutionary Perspective on Modes of Learning in Transformers

May 14, 2025
Via arXiv