Pieter Abbeel

Foundation Models for Decision Making: Problems, Methods, and Opportunities

Mar 07, 2023
Sherry Yang, Ofir Nachum, Yilun Du, Jason Wei, Pieter Abbeel, Dale Schuurmans

Preference Transformer: Modeling Human Preferences using Transformers for RL

Mar 02, 2023
Changyeon Kim, Jongjin Park, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

Chain of Hindsight Aligns Language Models with Feedback

Feb 27, 2023
Hao Liu, Carmelo Sferrazza, Pieter Abbeel

Aligning Text-to-Image Models using Human Feedback

Feb 23, 2023
Kimin Lee, Hao Liu, Moonkyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu

Robust and Versatile Bipedal Jumping Control through Multi-Task Reinforcement Learning

Feb 19, 2023
Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Languages are Rewards: Chain of Hindsight Finetuning using Human Feedback

Feb 13, 2023
Hao Liu, Carmelo Sferrazza, Pieter Abbeel

Guiding Pretraining in Reinforcement Learning with Large Language Models

Feb 13, 2023
Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas

Controllability-Aware Unsupervised Skill Discovery

Feb 13, 2023
Seohong Park, Kimin Lee, Youngwoon Lee, Pieter Abbeel

The Wisdom of Hindsight Makes Language Models Better Instruction Followers

Feb 10, 2023
Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, Joseph E. Gonzalez

Languages are Rewards: Hindsight Finetuning using Human Feedback

Feb 06, 2023
Hao Liu, Carmelo Sferrazza, Pieter Abbeel
