Dongfu Jiang

VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

Jun 24, 2024

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Jun 16, 2024

GenAI Arena: An Open Evaluation Platform for Generative Models

Jun 06, 2024

MANTIS: Interleaved Multi-Image Instruction Tuning

May 02, 2024

VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation

Dec 22, 2023

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Nov 27, 2023

TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks

Oct 01, 2023

LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion

Jun 10, 2023

PairReranker: Pairwise Reranking for Natural Language Generation

Dec 20, 2022