Title: Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts

Apr 29, 2025

Hanhua Hong, Chenghao Xiao, Yang Wang, Yiqi Liu, Wenge Rong, Chenghua Lin

Abstract: Evaluating natural language generation (NLG) systems is challenging due to the diversity of valid outputs. While human evaluation is the gold standard, it suffers from inconsistencies, lack of standardisation, and demographic biases, limiting reproducibility. LLM-based evaluation offers a scalable alternative but is highly sensitive to prompt design, where small variations can lead to significant discrepancies. In this work, we propose an inversion learning method that learns effective reverse mappings from model outputs back to their input instructions, enabling the automatic generation of highly effective, model-specific evaluation prompts. Our method requires only a single evaluation sample and eliminates the need for time-consuming manual prompt engineering, thereby improving both efficiency and robustness. Our work contributes toward a new direction for more robust and efficient LLM-based evaluation.

* 10 pages
