Jonas Geiping

Capability-Based Scaling Laws for LLM Red-Teaming

May 26, 2025
Via arXiv

Can you Finetune your Binoculars? Embedding Text Watermarks into the Weights of Large Language Models

Apr 08, 2025
Via arXiv

Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation

Feb 26, 2025
Via arXiv

Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers

Feb 12, 2025
Via arXiv

When, Where and Why to Average Weights?

Feb 10, 2025
Via arXiv

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Feb 07, 2025
Via arXiv

Fine, I'll Merge It Myself: A Multi-Fidelity Framework for Automated Model Merging

Feb 06, 2025
Via arXiv

Great Models Think Alike and this Undermines AI Oversight

Feb 06, 2025
Via arXiv

Training Data Reconstruction: Privacy due to Uncertainty?

Dec 11, 2024
Via arXiv

A Realistic Threat Model for Large Language Model Jailbreaks

Oct 21, 2024
Via arXiv