Chonghuan Wang

ShapE-GRPO: Shapley-Enhanced Reward Allocation for Multi-Candidate LLM Training

Mar 31, 2026
Via arXiv

Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?

Jul 16, 2025
Via arXiv

Experimenting on Markov Decision Processes with Local Treatments

Jul 29, 2024
Via arXiv