Dongsheng Ding

Alignment of large language models with constrained learning

May 26, 2025

Constrained Diffusion Models via Dual Training

Aug 27, 2024

Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs

Aug 19, 2024

One-Shot Safety Alignment for Large Language Models via Optimal Dualization

May 29, 2024

Resilient Constrained Reinforcement Learning

Dec 29, 2023

Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs

Jun 20, 2023

Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning

May 31, 2023

Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

Jun 06, 2022

Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence

Feb 08, 2022

Provably Efficient Safe Exploration via Primal-Dual Policy Optimization

Mar 01, 2020