Qianjia Cheng

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Sep 10, 2025
Via arXiv

CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics

Aug 25, 2025
Via arXiv