Runyu Zhang

Cooperative Multi-Agent Graph Bandits: UCB Algorithm and Regret Analysis

Jan 18, 2024
Via arXiv

Regularized Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity

Jun 27, 2023
Via arXiv

Neural Nonnegative Matrix Factorization for Hierarchical Multilayer Topic Modeling

Feb 28, 2023
Via arXiv

Policy Optimization for Markov Games: Unified Framework and Faster Convergence

Jun 06, 2022
Via arXiv

Gradient Play in Multi-Agent Markov Stochastic Games: Stationary Points and Convergence

Jun 17, 2021
Via arXiv

Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach

Feb 04, 2020
Via arXiv