
Huan Sun

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Nov 27, 2023
Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

Via arXiv

How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities

Nov 15, 2023
Lingbo Mo, Boshi Wang, Muhao Chen, Huan Sun

Via arXiv

TableLlama: Towards Open Large Generalist Models for Tables

Nov 15, 2023
Tianshu Zhang, Xiang Yue, Yifei Li, Huan Sun

Via arXiv

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning

Oct 03, 2023
Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

Via arXiv

AgentBench: Evaluating LLMs as Agents

Aug 07, 2023
Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

Via arXiv

Roll Up Your Sleeves: Working with a Collaborative and Engaging Task-Oriented Dialogue System

Jul 29, 2023
Lingbo Mo, Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Sunit Singh, Samuel Stevens, Chang-You Tai, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun

Via arXiv

Biomedical Language Models are Robust to Sub-optimal Tokenization

Jul 10, 2023
Bernal Jiménez Gutiérrez, Huan Sun, Yu Su

Via arXiv

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

Jun 16, 2023
Kai Zhang, Lingbo Mo, Wenhu Chen, Huan Sun, Yu Su

Via arXiv