Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks

Oct 20, 2023
Andrea Sottana, Bin Liang, Kai Zou, Zheng Yuan

View paper on arXiv