Philip S. Thomas

Optimizing for the Future in Non-Stationary MDPs
May 17, 2020
Yash Chandak, Georgios Theocharous, Shiv Shankar, Sridhar Mahadevan, Martha White, Philip S. Thomas

Learning Reusable Options for Multi-Task Reinforcement Learning
Jan 06, 2020
Francisco M. Garcia, Chris Nota, Philip S. Thomas

Reinforcement learning with a network of spiking agents
Nov 10, 2019
Sneha Aenugu, Abhishek Sharma, Sasikiran Yelamarthi, Hananel Hazan, Philip S. Thomas, Robert Kozma

Reinforcement learning with spiking coagents
Oct 31, 2019
Sneha Aenugu, Abhishek Sharma, Sasikiran Yelamarthi, Hananel Hazan, Philip S. Thomas, Robert Kozma

Is the Policy Gradient a Gradient?
Jun 17, 2019
Chris Nota, Philip S. Thomas

Classical Policy Gradient: Preserving Bellman's Principle of Optimality
Jun 06, 2019
Philip S. Thomas, Scott M. Jordan, Yash Chandak, Chris Nota, James Kostas

Reinforcement Learning When All Actions are Not Always Available
Jun 05, 2019
Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas

Lifelong Learning with a Changing Action Set
Jun 05, 2019
Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas

A New Confidence Interval for the Mean of a Bounded Random Variable
May 15, 2019
Erik Learned-Miller, Philip S. Thomas