Diego Mares

PRBench: Large-Scale Expert Rubrics for Evaluating High-Stakes Professional Reasoning

Nov 14, 2025

MultiNRC: A Challenging and Native Multilingual Reasoning Evaluation Benchmark for LLMs

Jul 23, 2025