Caiming Xiong

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Apr 11, 2024
Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases

Apr 03, 2024
Anthony Meng Huat Tiong, Junqi Zhao, Boyang Li, Junnan Li, Steven C. H. Hoi, Caiming Xiong

How Much are LLMs Contaminated? A Comprehensive Survey and the LLMSanitize Library

Mar 31, 2024
Mathieu Ravaut, Bosheng Ding, Fangkai Jiao, Hailin Chen, Xingxuan Li, Ruochen Zhao, Chengwei Qin, Caiming Xiong, Shafiq Joty

FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability

Feb 28, 2024
Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin, Caiming Xiong

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Feb 26, 2024
Jianguo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong

AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System

Feb 23, 2024
Zhiwei Liu, Weiran Yao, Jianguo Zhang, Liangwei Yang, Zuxin Liu, Juntao Tan, Prafulla K. Choubey, Tian Lan, Jason Wu, Huan Wang, Shelby Heinecke, Caiming Xiong, Silvio Savarese

Text2Data: Low-Resource Data Generation with Textual Control

Feb 08, 2024
Shiyu Wang, Yihao Feng, Tian Lan, Ning Yu, Yu Bai, Ran Xu, Huan Wang, Caiming Xiong, Silvio Savarese

Unified Training of Universal Time Series Forecasting Transformers

Feb 04, 2024
Gerald Woo, Chenghao Liu, Akshat Kumar, Caiming Xiong, Silvio Savarese, Doyen Sahoo

TrustLLM: Trustworthiness in Large Language Models

Jan 25, 2024
Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao

Causal Layering via Conditional Entropy

Jan 19, 2024
Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese