Daniel Deutsch

On the Role of Summary Content Units in Text Summarization Evaluation

Apr 02, 2024
Via arXiv

Finding Replicable Human Evaluations via Stable Ranking Probability

Apr 01, 2024
Via arXiv

Pinpoint, Not Criticize: Refining Large Language Models via Fine-Grained Actionable Feedback

Nov 15, 2023
Via arXiv

There's no Data Like Better Data: Using QE Metrics for MT Data Filtering

Nov 09, 2023
Via arXiv

The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics

Oct 30, 2023
Via arXiv

Training and Meta-Evaluating Machine Translation Evaluation Metrics at the Paragraph Level

Aug 28, 2023
Via arXiv

The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation

Aug 14, 2023
Via arXiv

Ties Matter: Modifying Kendall's Tau for Modern Metric Meta-Evaluation

May 23, 2023
Via arXiv

Needle in a Haystack: An Analysis of Finding Qualified Workers on MTurk for Summarization

Dec 28, 2022
Via arXiv

On the Limitations of Reference-Free Evaluations of Generated Text

Oct 22, 2022
Via arXiv