Title: Reward Machines for Cooperative Multi-Agent Reinforcement Learning

Jul 03, 2020

Cyrus Neary, Zhe Xu, Bo Wu, Ufuk Topcu

Abstract: In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal. We propose the use of reward machines (RMs) -- Mealy machines used as structured representations of reward functions -- to encode the team's task. The proposed novel interpretation of RMs in the multi-agent setting explicitly encodes required teammate interdependencies and independencies, allowing the team-level task to be decomposed into sub-tasks for individual agents. We define such a notion of RM decomposition and present algorithmically verifiable conditions guaranteeing that distributed completion of the sub-tasks leads to team behavior accomplishing the original task. This framework for task decomposition provides a natural approach to decentralized learning: agents may learn to accomplish their sub-tasks while observing only their local state and abstracted representations of their teammates. We accordingly propose a decentralized Q-learning algorithm. Furthermore, in the case of undiscounted rewards, we use local value functions to derive lower and upper bounds for the global value function corresponding to the team task. Experimental results in three discrete settings exemplify the effectiveness of the proposed RM decomposition approach, which converges to a successful team policy two orders of magnitude faster than a centralized learner and significantly outperforms hierarchical and independent Q-learning approaches.
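
To make the idea of a reward machine concrete, here is a minimal Python sketch of a Mealy machine used as a reward function. It is an illustration only, not the authors' implementation: the `RewardMachine` class, the state names `u0`/`u1`/`u2`, the events `button_pressed` and `goal_reached`, and the convention that undefined events leave the state unchanged with zero reward are all assumptions made for this example. The sketch shows only how an RM tracks task progress from a sequence of events; it does not implement the decomposition or the decentralized learning algorithm described in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """A Mealy machine: each (state, event) pair maps to (next_state, reward)."""

    initial_state: str
    # Hypothetical transition table: (state, event) -> (next_state, reward).
    transitions: dict = field(default_factory=dict)
    terminal_states: frozenset = frozenset()

    def step(self, state, event):
        # Assumption: an event with no defined transition leaves the
        # state unchanged and yields zero reward.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events):
        """Feed a sequence of events and return (final_state, total_reward)."""
        state, total = self.initial_state, 0.0
        for event in events:
            if state in self.terminal_states:
                break  # the task is already complete
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical two-step team task: a button must be pressed
# before reaching the goal counts as success.
team_rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "button_pressed"): ("u1", 0.0),
        ("u1", "goal_reached"): ("u2", 1.0),
    },
    terminal_states=frozenset({"u2"}),
)

final_state, total_reward = team_rm.run(
    ["goal_reached", "button_pressed", "goal_reached"]
)
```

In this run the first `goal_reached` event is ignored because the button has not yet been pressed, so the reward depends on the order of events rather than on the current environment state alone. That ordering information is the structure an RM exposes to the learner.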
