Xuhui Zhou

Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs

Mar 08, 2024
Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim, Maarten Sap

Via arXiv

FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions

Oct 31, 2023
Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap

Via arXiv

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Oct 27, 2023
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi

Via arXiv

SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

Oct 18, 2023
Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap

Via arXiv

WebArena: A Realistic Web Environment for Building Autonomous Agents

Jul 25, 2023
Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

Via arXiv

COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements

Jun 09, 2023
Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap

Via arXiv

Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models

May 24, 2023
Natalie Shapira, Mosh Levy, Seyed Hossein Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap, Vered Shwartz

Via arXiv

Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting

May 24, 2023
Akhila Yerukola, Xuhui Zhou, Maarten Sap

Via arXiv

Learning to translate by learning to communicate

Jul 14, 2022
C. M. Downey, Leo Z. Liu, Xuhui Zhou, Shane Steinert-Threlkeld

Via arXiv