Ivo Petrov

BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs

Oct 06, 2025
Via arXiv

MathArena: Evaluating LLMs on Uncontaminated Math Competitions

May 29, 2025
Via arXiv

Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad

Mar 27, 2025
Via arXiv

GRAIN: Exact Graph Reconstruction from Gradients

Mar 03, 2025
Via arXiv

DAGER: Exact Gradient Inversion for Large Language Models

May 24, 2024
Via arXiv