Abstract:We examine whether large language models (LLMs) can predict biased decision-making in conversational settings, and whether their predictions capture not only human cognitive biases but also how those effects change under cognitive load. In a pre-registered study (N = 1,648), participants completed six classic decision-making tasks via a chatbot with dialogues of varying complexity. Participants exhibited two well-documented cognitive biases: the Framing Effect and the Status Quo Bias. Increased dialogue complexity resulted in participants reporting higher mental demand. This increase in cognitive load selectively, but significantly, increased the effect of the biases, demonstrating the load-bias interaction. We then evaluated whether LLMs (GPT-4, GPT-5, and open-source models) could predict individual decisions given demographic information and prior dialogue. While results were mixed across choice problems, LLM predictions that incorporated dialogue context were significantly more accurate in several key scenarios. Importantly, their predictions reproduced the same bias patterns and load-bias interactions observed in humans. Across all models tested, the GPT-4 family consistently aligned with human behavior, outperforming GPT-5 and open-source models in both predictive accuracy and fidelity to human-like bias patterns. These findings advance our understanding of LLMs as tools for simulating human decision-making and inform the design of conversational agents that adapt to user biases.
Abstract:Heuristics and cognitive biases are an integral part of human decision-making. Automatically detecting a particular cognitive bias could enable intelligent tools to provide better decision-support. Detecting the presence of a cognitive bias currently requires a hand-crafted experiment and human interpretation. Our research aims to explore conversational agents as an effective tool to measure various cognitive biases in different domains. Our proposed conversational agent incorporates a bias measurement mechanism that is informed by the existing experimental designs and various experimental tasks identified in the literature. Our initial experiments to measure framing and loss-aversion biases indicate that the conversational agents can be effectively used to measure the biases.