Max Kleiman-Weiner

How LLMs Distort Our Written Language

Mar 18, 2026

Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians

Feb 22, 2026

When Empowerment Disempowers

Nov 06, 2025

Estimating the Empowerment of Language Model Agents

Sep 26, 2025

Preserving Sense of Agency: User Preferences for Robot Autonomy and User Control across Household Tasks

Jun 24, 2025

The Lock-in Hypothesis: Stagnation by Algorithm

Jun 06, 2025

Are Language Models Consequentialist or Deontological Moral Reasoners?

May 27, 2025

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

Apr 20, 2025

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

Oct 22, 2024

Value Internalization: Learning and Generalizing from Social Reward

Jul 19, 2024