Meriem Boubdir

Elo Uncovered: Robustness and Best Practices in Language Model Evaluation

Nov 29, 2023
Meriem Boubdir, Edward Kim, Beyza Ermis, Sara Hooker, Marzieh Fadaee

Via arXiv

Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation

Oct 22, 2023
Meriem Boubdir, Edward Kim, Beyza Ermis, Marzieh Fadaee, Sara Hooker

Via arXiv