Max Kleiman-Weiner

Preserving Sense of Agency: User Preferences for Robot Autonomy and User Control across Household Tasks

Jun 24, 2025
Via arXiv

The Lock-in Hypothesis: Stagnation by Algorithm

Jun 06, 2025
Via arXiv

Are Language Models Consequentialist or Deontological Moral Reasoners?

May 27, 2025
Via arXiv

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

Apr 20, 2025
Via arXiv

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

Oct 22, 2024
Via arXiv

Value Internalization: Learning and Generalizing from Social Reward

Jul 19, 2024
Via arXiv

Multilingual Trolley Problems for Language Models

Jul 02, 2024
Via arXiv

Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents

Apr 25, 2024
Via arXiv

CLadder: A Benchmark to Assess Causal Reasoning Capabilities of Language Models

Dec 07, 2023
Via arXiv

Learning to Coordinate with Humans using Action Features

Jan 29, 2022
Via arXiv