Alessandro Scirè

Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering

Mar 19, 2025
Via arXiv

Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis

Dec 02, 2024
Via arXiv

Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!

Aug 25, 2024
Via arXiv

FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction

Mar 04, 2024
Via arXiv

Echoes from Alexandria: A Large Resource for Multilingual Book Summarization

Jun 07, 2023
Via arXiv

Semantic Role Labeling Meets Definition Modeling: Using Natural Language to Describe Predicate-Argument Structures

Dec 02, 2022
Via arXiv