Xuanyu Lei

A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation

Apr 04, 2024
Jifan Yu, Xiaohan Zhang, Yifan Xu, Xuanyu Lei, Zijun Yao, Jing Zhang, Lei Hou, Juanzi Li

Via arXiv

Scaffolding Coordinates to Promote Vision-Language Coordination in Large Multi-Modal Models

Feb 19, 2024
Xuanyu Lei, Zonghan Yang, Xinrui Chen, Peng Li, Yang Liu

Via arXiv

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Dec 05, 2023
Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang

Via arXiv

CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

Nov 30, 2023
Pei Ke, Bosi Wen, Zhuoer Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang

Via arXiv

SafetyBench: Evaluating the Safety of Large Language Models with Multiple Choice Questions

Sep 13, 2023
Zhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu, Xuanyu Lei, Jie Tang, Minlie Huang

Via arXiv

AgentBench: Evaluating LLMs as Agents

Aug 07, 2023
Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

Via arXiv