Wenxiang Jiao

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments

Mar 18, 2024
Jen-tse Huang, Eric John Li, Man Ho Lam, Tian Liang, Wenxuan Wang, Youliang Yuan, Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Michael R. Lyu

Unsupervised Sign Language Translation and Generation

Feb 12, 2024
Zhengsheng Guo, Zhiwei He, Wenxiang Jiao, Xing Wang, Rui Wang, Kehai Chen, Zhaopeng Tu, Yong Xu, Min Zhang

Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model

Jan 23, 2024
Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, Zhaopeng Tu

The Earth is Flat? Unveiling Factual Errors in Large Language Models

Jan 01, 2024
Wenxuan Wang, Juluan Shi, Zhaopeng Tu, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu

A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models

Jan 01, 2024
Yuxuan Wan, Wenxuan Wang, Yiliu Yang, Youliang Yuan, Jen-tse Huang, Pinjia He, Wenxiang Jiao, Michael R. Lyu

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models

Nov 06, 2023
Tian Liang, Zhiwei He, Jen-tse Huang, Wenxuan Wang, Wenxiang Jiao, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi, Xing Wang

Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models

Oct 19, 2023
Wenxuan Wang, Wenxiang Jiao, Jingyuan Huang, Ruyi Dai, Jen-tse Huang, Zhaopeng Tu, Michael R. Lyu

Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench

Oct 02, 2023
Jen-tse Huang, Wenxuan Wang, Eric John Li, Man Ho Lam, Shujie Ren, Youliang Yuan, Wenxiang Jiao, Zhaopeng Tu, Michael R. Lyu

All Languages Matter: On the Multilingual Safety of Large Language Models

Oct 02, 2023
Wenxuan Wang, Zhaopeng Tu, Chang Chen, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu
