Yanghua Xiao

Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning

Jan 12, 2026

Structured Reasoning for Large Language Models

Jan 12, 2026

LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Jan 10, 2026

Logics-STEM: Empowering LLM Reasoning via Failure-Driven Post-Training and Document Knowledge Enhancement

Jan 08, 2026

Why Did Apple Fall To The Ground: Evaluating Curiosity In Large Language Model

Oct 23, 2025

Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following

Oct 16, 2025

HINT: Helping Ineffective Rollouts Navigate Towards Effectiveness

Oct 10, 2025

Accelerated Evolving Set Processes for Local PageRank Computation

Oct 09, 2025

CultureScope: A Dimensional Lens for Probing Cultural Understanding in LLMs

Sep 19, 2025

Curse of Knowledge: When Complex Evaluation Context Benefits yet Biases LLM Judges

Sep 03, 2025