Luis Fernando D'Haro

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

Jun 10, 2024

Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models

May 23, 2024

Awareness in robotics: An early perspective from the viewpoint of the EIC Pathfinder Challenge "Awareness Inside"

Feb 14, 2024

A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators

Dec 24, 2023

xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark

Oct 13, 2023

Overview of Robust and Multilingual Automatic Evaluation Metrics for Open-Domain Dialogue Systems at DSTC 11 Track 4

Jun 22, 2023

PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment

Dec 18, 2022

FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation

Oct 29, 2022

Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges

Mar 18, 2022

MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation

Dec 14, 2021