Benjamin Roth

Influential Training Data Retrieval for Explaining Verbalized Confidence of LLMs

Jan 15, 2026

Explaining Generalization of AI-Generated Text Detectors Through Linguistic Analysis

Jan 12, 2026

Calibration Is Not Enough: Evaluating Confidence Estimation Under Language Variations

Jan 12, 2026

Compact Example-Based Explanations for Language Models

Jan 07, 2026

Do LLM Self-Explanations Help Users Predict Model Behavior? Evaluating Counterfactual Simulatability with Pragmatic Perturbations

Jan 07, 2026

Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions

Dec 14, 2025

Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance

Aug 27, 2025

Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles

Jan 07, 2025

From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks

Sep 06, 2024

An Evaluation of Explanation Methods for Black-Box Detectors of Machine-Generated Text

Aug 26, 2024