Christopher Summerfield

One-shot emergency psychiatric triage across 15 frontier AI chatbots

Apr 28, 2026
Via arXiv

Measuring and Mitigating Persona Distortions from AI Writing Assistance

Apr 24, 2026
Via arXiv

Artificial intelligence can persuade people to take political actions

Apr 10, 2026
Via arXiv

Ask don't tell: Reducing sycophancy in large language models

Feb 27, 2026
Via arXiv

When Do LLM Preferences Predict Downstream Behavior?

Feb 21, 2026
Via arXiv

Reward Models Inherit Value Biases from Pretraining

Jan 28, 2026
Via arXiv

Can AI mediation improve democratic deliberation?

Jan 09, 2026
Via arXiv

Reward Model Interpretability via Optimal and Pessimal Tokens

Jun 08, 2025
Via arXiv

HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics

May 08, 2025
Via arXiv

Increasing happiness through conversations with artificial intelligence

Apr 02, 2025
Via arXiv