Nino Scherrer

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Dec 24, 2025

Do Depth-Grown Models Overcome the Curse of Depth? An In-Depth Analysis

Dec 09, 2025

No for Some, Yes for Others: Persona Prompts and Other Sources of False Refusal in Language Models

Sep 09, 2025

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

Jun 05, 2025

Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection

Apr 02, 2025

Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling

Jan 23, 2025

Multi-agent cooperation through learning-aware policy gradients

Oct 24, 2024

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Apr 18, 2024

FinanceBench: A New Benchmark for Financial Question Answering

Nov 20, 2023

SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models

Nov 14, 2023