Kaijie Zhu

NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models

Mar 05, 2024
Lizhou Fan, Wenyue Hua, Xiang Li, Kaijie Zhu, Mingyu Jin, Lingyao Li, Haoyang Ling, Jinkui Chi, Jindong Wang, Xin Ma, Yongfeng Zhang

Via arXiv

DyVal 2: Dynamic Evaluation of Large Language Models by Meta Probing Agents

Feb 21, 2024
Kaijie Zhu, Jindong Wang, Qinlin Zhao, Ruochen Xu, Xing Xie

Via arXiv

The Good, The Bad, and Why: Unveiling Emotions in Generative AI

Dec 19, 2023
Cheng Li, Jindong Wang, Yixuan Zhang, Kaijie Zhu, Xinyi Wang, Wenxin Hou, Jianxun Lian, Fang Luo, Qiang Yang, Xing Xie

Via arXiv

PromptBench: A Unified Library for Evaluation of Large Language Models

Dec 13, 2023
Kaijie Zhu, Qinlin Zhao, Hao Chen, Jindong Wang, Xing Xie

Via arXiv

CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents

Oct 26, 2023
Qinlin Zhao, Jindong Wang, Yixuan Zhang, Yiqiao Jin, Kaijie Zhu, Hao Chen, Xing Xie

Via arXiv

DyVal: Graph-informed Dynamic Evaluation of Large Language Models

Oct 05, 2023
Kaijie Zhu, Jiaao Chen, Jindong Wang, Neil Zhenqiang Gong, Diyi Yang, Xing Xie

Via arXiv

Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

Aug 01, 2023
Kaijie Zhu, Jindong Wang, Xixu Hu, Xing Xie, Ge Yang

Via arXiv

EmotionPrompt: Leveraging Psychology for Large Language Models Enhancement via Emotional Stimulus

Aug 01, 2023
Cheng Li, Jindong Wang, Kaijie Zhu, Yixuan Zhang, Wenxin Hou, Jianxun Lian, Xing Xie

Via arXiv

A Survey on Evaluation of Large Language Models

Jul 18, 2023
Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie

Via arXiv