R. Srikant

Performance of NPG in Countable State-Space Average-Cost RL

May 30, 2024
Via arXiv

On the Global Convergence of Policy Gradient in Average Reward Markov Decision Processes

Mar 11, 2024
Via arXiv

Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization

Feb 15, 2024
Via arXiv

Convergence for Natural Policy Gradient on Infinite-State Average-Reward Markov Decision Processes

Feb 07, 2024
Via arXiv

Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning

Jan 28, 2024
Via arXiv

Cascading Reinforcement Learning

Jan 17, 2024
Via arXiv

Striking a Balance: An Optimal Mechanism Design for Heterogenous Differentially Private Data Acquisition for Logistic Regression

Sep 19, 2023
Via arXiv

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

May 30, 2023
Via arXiv

A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games

Mar 17, 2023
Via arXiv

Performance Bounds for Policy-Based Average Reward Reinforcement Learning Algorithms

Feb 15, 2023
Via arXiv