Yunhao Tang

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

May 29, 2023
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko

Towards a Better Understanding of Representation Dynamics under TD-learning

May 29, 2023
Yunhao Tang, Rémi Munos

The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation

May 28, 2023
Mark Rowland, Yunhao Tang, Clare Lyle, Rémi Munos, Marc G. Bellemare, Will Dabney

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

May 22, 2023
Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

May 02, 2023
Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa

Fast Rates for Maximum Entropy Exploration

Mar 14, 2023
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Menard

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

Feb 09, 2023
Pierre H. Richemond, Allison Tam, Yunhao Tang, Florian Strub, Bilal Piot, Felix Hill

An Analysis of Quantile Temporal-Difference Learning

Jan 11, 2023
Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney

Understanding Self-Predictive Learning for Reinforcement Learning

Dec 06, 2022
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko

The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning

Jul 15, 2022
Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare
