Arman Cohan

FinDVer: Explainable Claim Verification over Long and Hybrid-Content Financial Documents

Nov 08, 2024

Bayesian Calibration of Win Rate Estimation with LLM Evaluators

Nov 07, 2024

M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models

Nov 06, 2024

TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models

Oct 30, 2024

COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences

Oct 30, 2024

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following

Oct 30, 2024

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains

Oct 11, 2024

ReIFE: Re-evaluating Instruction-Following Evaluation

Oct 09, 2024

MetaMath: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models

Sep 28, 2024

RouterRetriever: Exploring the Benefits of Routing over Multiple Expert Embedding Models

Sep 04, 2024