Yongbin Li

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence

May 30, 2025
Via arXiv

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

May 29, 2025
Via arXiv

Socratic-PRMBench: Benchmarking Process Reward Models with Systematic Reasoning Patterns

May 29, 2025
Via arXiv

Reverse Preference Optimization for Complex Instruction Following

May 28, 2025
Via arXiv

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

May 26, 2025
Via arXiv

Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

May 04, 2025
Via arXiv

Supervised Optimism Correction: Be Confident When LLMs Are Sure

Apr 10, 2025
Via arXiv

Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute

Mar 31, 2025
Via arXiv

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

Feb 28, 2025
Via arXiv

EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

Feb 18, 2025
Via arXiv