Guojun Yin

CDRRM: Contrast-Driven Rubric Generation for Reliable and Interpretable Reward Modeling

Mar 09, 2026
Via arXiv

SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training

Mar 03, 2026
Via arXiv

Contextual Rollout Bandits for Reinforcement Learning with Verifiable Rewards

Feb 09, 2026
Via arXiv

Your Group-Relative Advantage Is Biased

Jan 13, 2026
Via arXiv

Beyond Dialogue Time: Temporal Semantic Memory for Personalized LLM Agents

Jan 12, 2026
Via arXiv

AWPO: Enhancing Tool-Use of Large Language Models through Explicit Integration of Reasoning Rewards

Dec 23, 2025
Via arXiv

ToolForge: A Data Synthesis Pipeline for Multi-Hop Search without Real-World APIs

Dec 18, 2025
Via arXiv

LocalSearchBench: Benchmarking Agentic Search in Real-World Local Life Services

Dec 08, 2025
Via arXiv

From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory

Nov 11, 2025
Via arXiv

Promoting Efficient Reasoning with Verifiable Stepwise Reward

Aug 14, 2025
Via arXiv