Shangtong Zhang

Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models

Mar 10, 2025

Group Fairness in Multi-Task Reinforcement Learning

Mar 10, 2025

A Survey of In-Context Reinforcement Learning

Feb 11, 2025

Linear $Q$-Learning Does Not Diverge: Convergence Rates to a Bounded Set

Jan 31, 2025

CRASH: Challenging Reinforcement-Learning Based Adversarial Scenarios For Safety Hardening

Nov 26, 2024

Almost Sure Convergence Rates and Concentration of Stochastic Approximation and Reinforcement Learning with Markovian Noise

Nov 20, 2024

Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning

Oct 08, 2024

Doubly Optimal Policy Evaluation for Reinforcement Learning

Oct 03, 2024

Almost Sure Convergence of Average Reward Temporal Difference Learning

Sep 29, 2024

Almost Sure Convergence of Linear Temporal Difference Learning with Arbitrary Features

Sep 18, 2024