Yanghua Xiao

RubricEval: A Rubric-Level Meta-Evaluation Benchmark for LLM Judges in Instruction Following

Mar 26, 2026
Via arXiv

From AI Assistant to AI Scientist: Autonomous Discovery of LLM-RL Algorithms with LLM Agents

Mar 25, 2026
Via arXiv

Thinking with Constructions: A Benchmark and Policy Optimization for Visual-Text Interleaved Geometric Reasoning

Mar 19, 2026
Via arXiv

DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use

Mar 10, 2026
Via arXiv

CODA: Difficulty-Aware Compute Allocation for Adaptive Reasoning

Mar 09, 2026
Via arXiv

HER: Human-like Reasoning and Reinforcement Learning for LLM Role-playing

Jan 29, 2026
Via arXiv

HUMANLLM: Benchmarking and Reinforcing LLM Anthropomorphism via Human Cognitive Patterns

Jan 15, 2026
Via arXiv

Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning

Jan 12, 2026
Via arXiv

Structured Reasoning for Large Language Models

Jan 12, 2026
Via arXiv

LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Jan 10, 2026
Via arXiv