A. Rupam Mahmood

Model-free Policy Learning with Reward Gradients


Mar 09, 2021
Qingfeng Lan, A. Rupam Mahmood


Autoregressive Policies for Continuous Control Deep Reinforcement Learning


Mar 27, 2019
Dmytro Korenkevych, A. Rupam Mahmood, Gautham Vasan, James Bergstra

* Submitted to 28th International Joint Conference on Artificial Intelligence (IJCAI 2019). Video: https://youtu.be/NCpyXBNqNmw Code: https://github.com/dkorenkevych/arp 

Benchmarking Reinforcement Learning Algorithms on Real-World Robots


Sep 20, 2018
A. Rupam Mahmood, Dmytro Korenkevych, Gautham Vasan, William Ma, James Bergstra

* Appears in Proceedings of the Second Conference on Robot Learning (CoRL 2018). Companion video at https://youtu.be/ovDfhvjpQd8 and source code at https://github.com/kindredresearch/SenseAct 

Setting up a Reinforcement Learning Task with a Real-World Robot


Mar 19, 2018
A. Rupam Mahmood, Dmytro Korenkevych, Brent J. Komer, James Bergstra

* Submitted to 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 

True Online Temporal-Difference Learning


Sep 08, 2016
Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Marlos C. Machado, Richard S. Sutton

* Journal of Machine Learning Research (JMLR), 17(145):1-40, 2016 
* This is the published JMLR version, which is much improved. The main changes are: 1) restructuring of the article; 2) additional analysis of the forward view; 3) an empirical comparison of the traditional and new forward views; 4) an added discussion of other true online papers; 5) an updated discussion of non-linear function approximation 

Emphatic Temporal-Difference Learning


Jul 06, 2015
A. Rupam Mahmood, Huizhen Yu, Martha White, Richard S. Sutton

* 9 pages, accepted for presentation at European Workshop on Reinforcement Learning 

An Empirical Evaluation of True Online TD(λ)


Jul 01, 2015
Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Richard S. Sutton

* European Workshop on Reinforcement Learning (EWRL) 2015 

An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning


Apr 21, 2015
Richard S. Sutton, A. Rupam Mahmood, Martha White

* Journal of Machine Learning Research 17(73): 1-29, 2016 
* 29 pages. This is a significant revision based on the first set of reviews. The most important change was to signal early that the main result is about stability, not convergence 
