Mohammad Ghavamzadeh

Aligning Text-to-Image Models using Human Feedback

Feb 23, 2023
Kimin Lee, Hao Liu, Moonkyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu

Via arXiv

Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management

Feb 21, 2023
Dhawal Gupta, Yinlam Chow, Mohammad Ghavamzadeh, Craig Boutilier

Via arXiv

Multi-Task Off-Policy Learning from Bandit Feedback

Dec 09, 2022
Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

Via arXiv

Operator Splitting Value Iteration

Nov 25, 2022
Amin Rakhsha, Andrew Wang, Mohammad Ghavamzadeh, Amir-massoud Farahmand

Via arXiv

RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk

Sep 14, 2022
Jia Lin Hau, Marek Petrik, Mohammad Ghavamzadeh, Reazul Russel

Via arXiv

Robust Reinforcement Learning using Offline Data

Aug 10, 2022
Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, Mohammad Ghavamzadeh

Via arXiv

Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings

Jul 01, 2022
Jorge A. Mendez, Alborz Geramifard, Mohammad Ghavamzadeh, Bing Liu

Via arXiv

A Mixture-of-Expert Approach to RL-based Dialogue Management

May 31, 2022
Yinlam Chow, Aza Tulepbergenov, Ofir Nachum, MoonKyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier

Via arXiv

Collaborative Multi-agent Stochastic Linear Bandits

May 12, 2022
Ahmadreza Moradipari, Mohammad Ghavamzadeh, Mahnoosh Alizadeh

Via arXiv