Aditya Mahajan

Periodic agent-state based Q-learning for POMDPs

Jul 08, 2024

Model approximation in MDPs with unbounded per-step cost

Feb 13, 2024

Bridging State and History Representations: Understanding Self-Predictive RL

Jan 17, 2024

Approximate information state based convergence analysis of recurrent Q-learning

Jun 09, 2023

Dealing With Non-stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning

Feb 06, 2023

On learning history based policies for controlling Markov decision processes

Nov 06, 2022

On learning Whittle index policy for restless bandits with scalable regret

Feb 07, 2022

Consistency and Rate of Convergence of Switched Least Squares System Identification for Autonomous Switched Linear Systems

Dec 20, 2021

Scalable Operator Allocation for Multi-Robot Assistance: A Restless Bandit Approach

Nov 11, 2021

A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems

Aug 19, 2021