Dieuwke Hupkes

The Llama 3 Herd of Models

Jul 31, 2024
Via arXiv

Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges

Jun 18, 2024
Via arXiv

Quantifying Variance in Evaluation Benchmarks

Jun 14, 2024
Via arXiv

Interpretability of Language Models via Task Spaces

Jun 10, 2024
Via arXiv

From Form to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency

Apr 18, 2024
Via arXiv

The ICL Consistency Test

Dec 08, 2023
Via arXiv

WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models

Nov 27, 2023
Via arXiv

Memorisation Cartography: Mapping out the Memorisation-Generalisation Continuum in Neural Machine Translation

Nov 09, 2023
Via arXiv

The Validity of Evaluation Results: Assessing Concurrence Across Compositionality Benchmarks

Oct 26, 2023
Via arXiv

Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning

Oct 20, 2023
Via arXiv