
Yutao Hou

Unveiling Over-Memorization in Finetuning LLMs for Reasoning Tasks

Aug 06, 2025
Via arXiv

Automatic Robustness Stress Testing of LLMs as Mathematical Problem Solvers

Jun 05, 2025
Via arXiv

Compound-QA: A Benchmark for Evaluating LLMs on Compound Questions

Nov 15, 2024
Via arXiv