Raffaella Bernardi

CIMeC - Center for Mind/Brain Sciences, University of Trento

Movie Facts and Fibs (MF$^2$): A Benchmark for Long Movie Understanding

Jun 06, 2025

A MIND for Reasoning: Meta-learning for In-context Deduction

May 20, 2025

Playpen: An Environment for Exploring Learning Through Conversational Interaction

Apr 11, 2025

All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark

Feb 24, 2025

Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests

Feb 20, 2025

The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It

Feb 17, 2025

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

Jun 26, 2024

Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain

Jun 25, 2024

A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences

Jun 17, 2024

Looking for Confirmations: An Effective and Human-Like Visual Dialogue Strategy

Sep 11, 2021