MastermindEval: A Simple But Scalable Reasoning Benchmark

Mar 11, 2025


View paper on arXiv
