Shuyan Zhou

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Apr 11, 2024
Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

Jan 24, 2024
Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried

WebArena: A Realistic Web Environment for Building Autonomous Agents

Jul 25, 2023
Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

Hierarchical Prompting Assists Large Language Model on Web Navigation

May 23, 2023
Abishek Sridhar, Robert Lo, Frank F. Xu, Hao Zhu, Shuyan Zhou

Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation

May 01, 2023
Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, Amanda Bertsch, José G. C. de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André F. T. Martins

CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code

Feb 10, 2023
Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig

Causal Reasoning of Entities and Events in Procedural Texts

Jan 29, 2023
Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch

Execution-Based Evaluation for Open-Domain Code Generation

Dec 20, 2022
Zhiruo Wang, Shuyan Zhou, Daniel Fried, Graham Neubig

PAL: Program-aided Language Models

Nov 18, 2022
Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig
