Manohar Swaminathan

PARIKSHA : A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data

Jun 21, 2024
Via arXiv

Private Benchmarking to Prevent Contamination and Improve Comparative Evaluation of LLMs

Mar 01, 2024
Via arXiv