Lukas Galke Poech

The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment

Jun 09, 2026
Via arXiv

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

Jun 08, 2026
Via arXiv

BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling

Jun 08, 2026
Via arXiv

LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs

Jun 04, 2026
Via arXiv

Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion

May 29, 2026
Via arXiv

Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals

May 25, 2026
Via arXiv

ChronoMedKG: A Temporally-Grounded Biomedical Knowledge Graph and Benchmark for Clinical Reasoning

May 21, 2026
Via arXiv

SommBench: Assessing Sommelier Expertise of Language Models

Mar 12, 2026
Via arXiv

FlexMoRE: A Flexible Mixture of Rank-heterogeneous Experts for Efficient Federatedly-trained Large Language Models

Feb 09, 2026
Via arXiv

Training Language Models to Use Prolog as a Tool

Dec 08, 2025
Via arXiv