Qihui Zhang

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark

Feb 07, 2024
Dongping Chen, Ruoxi Chen, Shilin Zhang, Yinuo Liu, Yaochen Wang, Huichi Zhou, Qihui Zhang, Pan Zhou, Yao Wan, Lichao Sun

TrustLLM: Trustworthiness in Large Language Models

Jan 25, 2024
Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao

LLM-as-a-Coauthor: The Challenges of Detecting LLM-Human Mixcase

Jan 11, 2024
Chujie Gao, Dongping Chen, Qihui Zhang, Yue Huang, Yao Wan, Lichao Sun

MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use

Oct 12, 2023
Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu, Qihui Zhang, Yixin Liu, Pan Zhou, Yao Wan, Neil Zhenqiang Gong, Lichao Sun

MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use

Oct 04, 2023
Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu, Qihui Zhang, Yixin Liu, Pan Zhou, Yao Wan, Neil Zhenqiang Gong, Lichao Sun

TrustGPT: A Benchmark for Trustworthy and Responsible Large Language Models

Jun 20, 2023
Yue Huang, Qihui Zhang, Philip S. Yu, Lichao Sun
