Yangchen Pan

Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination

May 28, 2024

DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime

May 28, 2024

An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models

Apr 23, 2024

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

Mar 20, 2024

Improving Adversarial Transferability via Model Alignment

Nov 30, 2023

Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods

Aug 13, 2023

An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient

Aug 09, 2023

Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning

Mar 16, 2023

The In-Sample Softmax for Offline Reinforcement Learning

Feb 28, 2023

Label Alignment Regularization for Distribution Shift

Nov 27, 2022