Dale Schuurmans

Joint Attention for Multi-Agent Coordination and Social Learning
Apr 15, 2021
Dennis Lee, Natasha Jaques, Chase Kew, Douglas Eck, Dale Schuurmans, Aleksandra Faust

On the Optimality of Batch Policy Optimization Algorithms
Apr 06, 2021
Chenjun Xiao, Yifan Wu, Tor Lattimore, Bo Dai, Jincheng Mei, Lihong Li, Csaba Szepesvari, Dale Schuurmans

Optimization Issues in KL-Constrained Approximate Policy Iteration
Feb 11, 2021
Nevena Lazić, Botao Hao, Yasin Abbasi-Yadkori, Dale Schuurmans, Csaba Szepesvári

Offline Policy Selection under Uncertainty
Dec 12, 2020
Mengjiao Yang, Bo Dai, Ofir Nachum, George Tucker, Dale Schuurmans

Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration
Nov 10, 2020
Hanjun Dai, Rishabh Singh, Bo Dai, Charles Sutton, Dale Schuurmans

CoinDICE: Off-Policy Confidence Interval Estimation
Oct 22, 2020
Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans

Attention that does not Explain Away
Sep 29, 2020
Nan Ding, Xinjie Fan, Zhenzhong Lan, Dale Schuurmans, Radu Soricut

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Jul 21, 2020
Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Shane Gu

Off-Policy Evaluation via the Regularized Lagrangian
Jul 07, 2020
Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks
Jul 01, 2020
Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc Le, Qiang Liu, Dale Schuurmans