Martin Tutek

Sequence Repetition Enhances Token Embeddings and Improves Sequence Labeling with Decoder-only Language Models

Jan 24, 2026
Via arXiv

PragWorld: A Benchmark Evaluating LLMs' Local World Model under Minimal Linguistic Alterations and Conversational Dynamics

Nov 17, 2025
Via arXiv

ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs

Oct 01, 2025
Via arXiv

Context Parametrization with Compositional Adapters

Sep 26, 2025
Via arXiv

Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings

Jun 16, 2025
Via arXiv

MIB: A Mechanistic Interpretability Benchmark

Apr 17, 2025
Via arXiv

Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps

Feb 20, 2025
Via arXiv

REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space

Jun 13, 2024
Via arXiv

Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs

Jan 18, 2024
Via arXiv

Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness

Oct 04, 2023
Via arXiv