Le Sun

P^2O: Joint Policy and Prompt Optimization

Mar 23, 2026

Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning

Mar 11, 2026

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

Mar 10, 2026

DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation

Feb 26, 2026

Beyond Local Edits: Embedding-Virtualized Knowledge for Broader Evaluation and Preservation of Model Editing

Feb 02, 2026

Will It Zero-Shot?: Predicting Zero-Shot Classification Performance For Arbitrary Queries

Jan 27, 2026

Will It Zero-Shot?: Predicting Zero-Shot Classification Performance For Arbitrary Queries

Jan 24, 2026

Coupled Variational Reinforcement Learning for Language Model General Reasoning

Dec 14, 2025

AI-Salesman: Towards Reliable Large Language Model Driven Telemarketing

Nov 15, 2025

RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing

Jul 27, 2025