Sina Zarrieß

Talking to a Know-It-All GPT or a Second-Guesser Claude? How Repair reveals unreliable Multi-Turn Behavior in LLMs

Apr 22, 2026
Via arXiv

Reference Games as a Testbed for the Alignment of Model Uncertainty and Clarification Requests

Jan 12, 2026
Via arXiv

Surprisal and Metaphor Novelty: Moderate Correlations and Divergent Scaling Effects

Jan 08, 2026
Via arXiv

Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning)

Oct 23, 2025
Via arXiv

Are BabyLMs Deaf to Gricean Maxims? A Pragmatic Evaluation of Sample-efficient Language Models

Oct 06, 2025
Via arXiv

The InviTE Corpus: Annotating Invectives in Tudor English Texts for Computational Modeling

Sep 26, 2025
Via arXiv

SceneGram: Conceptualizing and Describing Tangrams in Scene Context

Jun 13, 2025
Via arXiv

Are Multimodal Large Language Models Pragmatically Competent Listeners in Simple Reference Resolution Tasks?

Jun 13, 2025
Via arXiv

Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions

Jun 11, 2025
Via arXiv

LLMs Struggle to Reject False Presuppositions when Misinformation Stakes are High

May 28, 2025
Via arXiv