Yunfan Zhou

SPADER: Step-wise Peer Advantage with Diversity-Aware Exploration Rewards for Multi-Answer Question Answering

May 30, 2026

Offline Reinforcement Learning with Adaptive Behavior Regularization

Nov 15, 2022

Yordle: An Efficient Imitation Learning for Branch and Bound

Feb 02, 2022

An Improved Reinforcement Learning Algorithm for Learning to Branch

Jan 17, 2022