Mark Dredze

Weird Generalization is Weirdly Brittle

Apr 11, 2026
Via arXiv

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

Mar 10, 2026
Via arXiv

FIRE-Bench: Evaluating Agents on the Rediscovery of Scientific Insights

Feb 02, 2026
Via arXiv

Knowing But Not Doing: Convergent Morality and Divergent Action in LLMs

Jan 12, 2026
Via arXiv

Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation

Jan 10, 2026
Via arXiv

Evaluating Implicit Biases in LLM Reasoning through Logic Grid Puzzles

Nov 08, 2025
Via arXiv

Evaluating the Evaluators: Are readability metrics good measures of readability?

Aug 26, 2025
Via arXiv

What Is Seen Cannot Be Unseen: The Disruptive Effect of Knowledge Conflict on Large Language Models

Jun 06, 2025
Via arXiv

Label-Guided In-Context Learning for Named Entity Recognition

May 29, 2025
Via arXiv

MedScore: Factuality Evaluation of Free-Form Medical Answers

May 24, 2025
Via arXiv