Javad Lavaei

A CMDP-within-online framework for Meta-Safe Reinforcement Learning

May 26, 2024

Pausing Policy Learning in Non-stationary Reinforcement Learning

May 25, 2024

Absence of spurious solutions far from ground truth: A low-rank analysis with high-order losses

Mar 10, 2024

Algorithmic Regularization in Tensor Optimization: Towards a Lifted Approach in Matrix Sensing

Oct 24, 2023

Tempo Adaptation in Non-stationary Reinforcement Learning

Sep 26, 2023

Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities

May 27, 2023

Exact Recovery for System Identification with More Corrupt Data than Clean Data

May 17, 2023

Scalable Multi-Agent Reinforcement Learning with General Utilities

Feb 15, 2023

Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points

Feb 15, 2023

Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design

Nov 19, 2022