Martin Tutek

Sequence Repetition Enhances Token Embeddings and Improves Sequence Labeling with Decoder-only Language Models

Jan 24, 2026
Via arXiv

PragWorld: A Benchmark Evaluating LLMs' Local World Model under Minimal Linguistic Alterations and Conversational Dynamics

Nov 17, 2025
Via arXiv

ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs

Oct 01, 2025
Via arXiv

Context Parametrization with Compositional Adapters

Sep 26, 2025
Via arXiv

Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings

Jun 16, 2025
Via arXiv

MIB: A Mechanistic Interpretability Benchmark

Apr 17, 2025
Via arXiv

Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps

Feb 20, 2025
Via arXiv

REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space

Jun 13, 2024
Via arXiv

Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs

Jan 18, 2024
Via arXiv

Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness

Oct 04, 2023
Via arXiv