
Jixuan Chen

COMMA: A Communicative Multimodal Multi-Agent Benchmark

Oct 10, 2024

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Jul 15, 2024

BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations

Jul 03, 2024

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Apr 11, 2024