Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations

Add code
Apr 15, 2024

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: