Kaixin Ma

COLUMBUS: Evaluating COgnitive Lateral Understanding through Multiple-choice reBUSes

Sep 06, 2024

DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems

Jul 15, 2024

MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning

Apr 24, 2024

SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense

Apr 22, 2024

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

Jan 28, 2024

Dense X Retrieval: What Retrieval Granularity Should We Use?

Dec 12, 2023

Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models

Nov 15, 2023

BRAINTEASER: Lateral Thinking Puzzles for Large Language Models

Oct 10, 2023

LASER: LLM Agent with State-Space Exploration for Web Navigation

Sep 15, 2023

A Study of Situational Reasoning for Traffic Understanding

Jun 05, 2023