Benjamin Van Durme

Johns Hopkins University

Seq vs Seq: An Open Suite of Paired Encoders and Decoders

Jul 15, 2025, via arXiv

How Grounded is Wikipedia? A Study on Structured Evidential Support

Jun 14, 2025, via arXiv

Jailbreak Distillation: Renewable Safety Benchmarking

May 28, 2025, via arXiv

Rank-K: Test-Time Reasoning for Listwise Reranking

May 20, 2025, via arXiv

Always Tell Me The Odds: Fine-grained Conditional Probability Estimation

May 02, 2025, via arXiv

MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools

Apr 28, 2025, via arXiv

Certified Mitigation of Worst-Case LLM Copyright Infringement

Apr 22, 2025, via arXiv

Can LLMs Generate Tabular Summaries of Science Papers? Rethinking the Evaluation Protocol

Apr 14, 2025, via arXiv

Bonsai: Interpretable Tree-Adaptive Grounded Reasoning

Apr 04, 2025, via arXiv

SpectR: Dynamically Composing LM Experts with Spectral Routing

Apr 04, 2025, via arXiv