Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Stacy Marsella

The Theory of Mind Utility: Formal Specification of a Mentalizing Mechanism

Jun 10, 2026

Nikolos Gurney, Stacy Marsella

Abstract:Inferring others' beliefs requires more than reading surface signals; it requires tracking who told them what, in what order, and how credibly. The Theory of Mind Utility (ToM-U) formalizes this epistemic state inference problem at the computational level of analysis, specifying what mentalizing computes and why without commitment to algorithmic or neural implementation. ToM-U achieves this by constructing Local Epistemic World Models (LEWMs) -- directed typed graphs that represent agents, state nodes, and the epistemic relationships among them -- and evaluating discrete candidate LEWMs against observed behavior until one achieves sufficient confidence. Five formal definitions specify the LEWM structure, agent node properties including ordered information access history, a bounded proliferation mechanism for recursive mentalizing, three inference procedures, and a residue function that captures the structured trace left by failed mentalizing attempts. ToM-U differs from Bayesian Theory of Mind and adjacent formal accounts, which presuppose rather than derive belief states, and from simulation theory and theory-theory, which lack a formal apparatus for epistemic state inference. The architecture generates directional, falsifiable predictions about mentalizing failure that follow from structural properties of the model rather than auxiliary assumptions, and positions ToM-U as a domain-agnostic mechanism upstream of goal inference and other downstream social cognitive processes.

Via

Access Paper or Ask Questions

Modeling Bounded Rationality in Drug Shortage Pharmacists Using Attention-Guided Dynamic Decomposition

May 13, 2026

Yaniv Eliyahu Amiri, Noah Chicoine, Jacqueline Griffin, Stacy Marsella

Abstract:Hospital pharmacists make high-stakes decisions to mitigate drug shortages under uncertainty, time pressure, and patient risk. Interviews revealed that pharmacists focus attention on a small subset of drugs, limiting cognitive effort to the most urgent cases. Motivated by these findings, we formalize a bounded-rational, attention-guided decision framework that dynamically decomposes drugs into a subset for high-cost reasoning and a complementary subset for low-cost monitoring. We develop two agents: an Expert Agent that applies attention weights derived from pharmacist interviews, and a Learner Agent that adapts attention allocation over time through experience. Across simulated scenarios spanning short to long horizons, we show that attention-guided planning supports stable decision-making without complete state reasoning. These results suggest that a primary decision is not what action to take, but where to allocate cognitive effort, and that attention-guided, satisficing strategies can reduce problem complexity while maintaining stable performance.

* Accepted at CogSci 2026. 6 pages plus references, 1 figure, 2 tables

Via

Access Paper or Ask Questions

Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming

Feb 23, 2026

Ian Steenstra, Paola Pedrelli, Weiyan Shi, Stacy Marsella, Timothy W. Bickmore

Abstract:Large Language Models (LLMs) are increasingly utilized for mental health support; however, current safety benchmarks often fail to detect the complex, longitudinal risks inherent in therapeutic dialogue. We introduce an evaluation framework that pairs AI psychotherapists with simulated patient agents equipped with dynamic cognitive-affective models and assesses therapy session simulations against a comprehensive quality of care and risk ontology. We apply this framework to a high-impact test case, Alcohol Use Disorder, evaluating six AI agents (including ChatGPT, Gemini, and Character.AI) against a clinically-validated cohort of 15 patient personas representing diverse clinical phenotypes. Our large-scale simulation (N=369 sessions) reveals critical safety gaps in the use of AI for mental health support. We identify specific iatrogenic risks, including the validation of patient delusions ("AI Psychosis") and failure to de-escalate suicide risk. Finally, we validate an interactive data visualization dashboard with diverse stakeholders, including AI engineers and red teamers, mental health professionals, and policy experts (N=9), demonstrating that this framework effectively enables stakeholders to audit the "black box" of AI psychotherapy. These findings underscore the critical safety risks of AI-provided mental health support and the necessity of simulation-based clinical red teaming before deployment.

* This paper is a condensed version of the first author's Ph.D. dissertation submitted to Northeastern University

Via

Access Paper or Ask Questions

A Graphical Model of Hurricane Evacuation Behaviors

Nov 16, 2023

Hui Sophie Wang, Nutchanon Yongsatianchot, Stacy Marsella

Abstract:Natural disasters such as hurricanes are increasing and causing widespread devastation. People's decisions and actions regarding whether to evacuate or not are critical and have a large impact on emergency planning and response. Our interest lies in computationally modeling complex relationships among various factors influencing evacuation decisions. We conducted a study on the evacuation of Hurricane Irma of the 2017 Atlantic hurricane season. The study was guided by the Protection motivation theory (PMT), a widely-used framework to understand people's responses to potential threats. Graphical models were constructed to represent the complex relationships among the factors involved and the evacuation decision. We evaluated different graphical structures based on conditional independence tests using Irma data. The final model largely aligns with PMT. It shows that both risk perception (threat appraisal) and difficulties in evacuation (coping appraisal) influence evacuation decisions directly and independently. Certain information received from media was found to influence risk perception, and through it influence evacuation behaviors indirectly. In addition, several variables were found to influence both risk perception and evacuation behaviors directly, including family and friends' suggestions, neighbors' evacuation behaviors, and evacuation notices from officials.

Via

Access Paper or Ask Questions

Investigating Large Language Models' Perception of Emotion Using Appraisal Theory

Oct 03, 2023

Nutchanon Yongsatianchot, Parisa Ghanad Torshizi, Stacy Marsella

Figure 1 for Investigating Large Language Models' Perception of Emotion Using Appraisal Theory

Figure 2 for Investigating Large Language Models' Perception of Emotion Using Appraisal Theory

Figure 3 for Investigating Large Language Models' Perception of Emotion Using Appraisal Theory

Figure 4 for Investigating Large Language Models' Perception of Emotion Using Appraisal Theory

Abstract:Large Language Models (LLM) like ChatGPT have significantly advanced in recent years and are now being used by the general public. As more people interact with these systems, improving our understanding of these black box models is crucial, especially regarding their understanding of human psychological aspects. In this work, we investigate their emotion perception through the lens of appraisal and coping theory using the Stress and Coping Process Questionaire (SCPQ). SCPQ is a validated clinical instrument consisting of multiple stories that evolve over time and differ in key appraisal variables such as controllability and changeability. We applied SCPQ to three recent LLMs from OpenAI, davinci-003, ChatGPT, and GPT-4 and compared the results with predictions from the appraisal theory and human data. The results show that LLMs' responses are similar to humans in terms of dynamics of appraisal and coping, but their responses did not differ along key appraisal dimensions as predicted by the theory and data. The magnitude of their responses is also quite different from humans in several variables. We also found that GPTs can be quite sensitive to instruction and how questions are asked. This work adds to the growing literature evaluating the psychological aspects of LLMs and helps enrich our understanding of the current models.

* 11th International Conference on Affective Computing and Intelligent Interaction Workshop and Demo (ACIIW) 2023 1-8

Via

Access Paper or Ask Questions

EvolvingBehavior: Towards Co-Creative Evolution of Behavior Trees for Game NPCs

Sep 01, 2022

Nathan Partlan, Luis Soto, Jim Howe, Sarthak Shrivastava, Magy Seif El-Nasr, Stacy Marsella

Figure 1 for EvolvingBehavior: Towards Co-Creative Evolution of Behavior Trees for Game NPCs

Figure 2 for EvolvingBehavior: Towards Co-Creative Evolution of Behavior Trees for Game NPCs

Figure 3 for EvolvingBehavior: Towards Co-Creative Evolution of Behavior Trees for Game NPCs

Figure 4 for EvolvingBehavior: Towards Co-Creative Evolution of Behavior Trees for Game NPCs

Abstract:To assist game developers in crafting game NPCs, we present EvolvingBehavior, a novel tool for genetic programming to evolve behavior trees in Unreal Engine 4. In an initial evaluation, we compare evolved behavior to hand-crafted trees designed by our researchers, and to randomly-grown trees, in a 3D survival game. We find that EvolvingBehavior is capable of producing behavior approaching the designer's goals in this context. Finally, we discuss implications and future avenues of exploration for co-creative game AI design tools, as well as challenges and difficulties in behavior tree evolution.

* 13 pages, 5 figures. Accepted for publication in Foundations of Digital Games 2022 (FDG '22)

Via

Access Paper or Ask Questions

Study of detecting behavioral signatures within DeepFake videos

Aug 06, 2022

Qiaomu Miao, Sinhwa Kang, Stacy Marsella, Steve DiPaola, Chao Wang, Ari Shapiro

Figure 1 for Study of detecting behavioral signatures within DeepFake videos

Figure 2 for Study of detecting behavioral signatures within DeepFake videos

Figure 3 for Study of detecting behavioral signatures within DeepFake videos

Figure 4 for Study of detecting behavioral signatures within DeepFake videos

Abstract:There is strong interest in the generation of synthetic video imagery of people talking for various purposes, including entertainment, communication, training, and advertisement. With the development of deep fake generation models, synthetic video imagery will soon be visually indistinguishable to the naked eye from a naturally capture video. In addition, many methods are continuing to improve to avoid more careful, forensic visual analysis. Some deep fake videos are produced through the use of facial puppetry, which directly controls the head and face of the synthetic image through the movements of the actor, allow the actor to 'puppet' the image of another. In this paper, we address the question of whether one person's movements can be distinguished from the original speaker by controlling the visual appearance of the speaker but transferring the behavior signals from another source. We conduct a study by comparing synthetic imagery that: 1) originates from a different person speaking a different utterance, 2) originates from the same person speaking a different utterance, and 3) originates from a different person speaking the same utterance. Our study shows that synthetic videos in all three cases are seen as less real and less engaging than the original source video. Our results indicate that there could be a behavioral signature that is detectable from a person's movements that is separate from their visual appearance, and that this behavioral signature could be used to distinguish a deep fake from a properly captured video.

* 9 pages

Via

Access Paper or Ask Questions

Design-Driven Requirements for Computationally Co-Creative Game AI Design Tools

Jul 29, 2021

Nathan Partlan, Erica Kleinman, Jim Howe, Sabbir Ahmad, Stacy Marsella, Magy Seif El-Nasr

Figure 1 for Design-Driven Requirements for Computationally Co-Creative Game AI Design Tools

Abstract:Game AI designers must manage complex interactions between the AI character, the game world, and the player, while achieving their design visions. Computational co-creativity tools can aid them, but first, AI and HCI researchers must gather requirements and determine design heuristics to build effective co-creative tools. In this work, we present a participatory design study that categorizes and analyzes game AI designers' workflows, goals, and expectations for such tools. We evince deep connections between game AI design and the design of co-creative tools, and present implications for future co-creativity tool research and development.

* 12 pages, 1 figure. Accepted for publication in Foundations of Digital Games (FDG) 2021

Via

Access Paper or Ask Questions