Baoxiang Wang

Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning

Dec 28, 2025
Via arXiv

Trust Region Masking for Long-Horizon LLM Reinforcement Learning

Dec 28, 2025
Via arXiv

Policy-Conditioned Policies for Multi-Agent Task Solving

Dec 24, 2025
Via arXiv

Reinforcement Learning for Target Zone Blood Glucose Control

Aug 05, 2025
Via arXiv

Information Bargaining: Bilateral Commitment in Bayesian Persuasion

Jun 09, 2025
Via arXiv

Bayesian Persuasion as a Bargaining Game

Jun 06, 2025
Via arXiv

ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning

May 29, 2025
Via arXiv

Learning to Negotiate via Voluntary Commitment

Mar 05, 2025
Via arXiv

Verbalized Bayesian Persuasion

Feb 03, 2025
Via arXiv

On the Decomposition of Differential Game

Nov 06, 2024
Via arXiv