Yoav Levine

FormulaOne: Measuring the Depth of Algorithmic Reasoning Beyond Competitive Programming

Jul 17, 2025
Via arXiv

Artificial Expert Intelligence through PAC-reasoning

Dec 03, 2024
Via arXiv

Language model developers should report train-test overlap

Oct 10, 2024
Via arXiv

Rationality Report Cards: Assessing the Economic Rationality of Large Language Models

Feb 14, 2024
Via arXiv

Tradeoffs Between Alignment and Helpfulness in Language Models

Feb 05, 2024
Via arXiv

Generating Benchmarks for Factuality Evaluation of Language Models

Jul 13, 2023
Via arXiv

Human or Not? A Gamified Approach to the Turing Test

May 31, 2023
Via arXiv

Fundamental Limitations of Alignment in Large Language Models

Apr 19, 2023
Via arXiv

The Learnability of In-Context Learning

Mar 14, 2023
Via arXiv

In-Context Retrieval-Augmented Language Models

Jan 31, 2023
Via arXiv