Erdem Bıyık

ViSaRL: Visual Reinforcement Learning Guided by Human Saliency

Mar 16, 2024
Anthony Liang, Jesse Thomason, Erdem Bıyık

A Generalized Acquisition Function for Preference-based Reward Learning

Mar 09, 2024
Evan Ellis, Gaurav R. Ghosal, Stuart J. Russell, Anca Dragan, Erdem Bıyık

DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

Feb 25, 2024
Anthony Liang, Guy Tennenholtz, Chih-wei Hsu, Yinlam Chow, Erdem Bıyık, Craig Boutilier

Batch Active Learning of Reward Functions from Human Preferences

Feb 24, 2024
Erdem Bıyık, Nima Anari, Dorsa Sadigh

RoboCLIP: One Demonstration is Enough to Learn Robot Policies

Oct 11, 2023
Sumedh A Sontakke, Jesse Zhang, Sébastien M. R. Arnold, Karl Pertsch, Erdem Bıyık, Dorsa Sadigh, Chelsea Finn, Laurent Itti

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Jul 27, 2023
Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Bıyık, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell

Active Reward Learning from Online Preferences

Feb 27, 2023
Vivek Myers, Erdem Bıyık, Dorsa Sadigh

Leveraging Smooth Attention Prior for Multi-Agent Trajectory Prediction

Mar 19, 2022
Zhangjie Cao, Erdem Bıyık, Guy Rosman, Dorsa Sadigh

Learning Multimodal Rewards from Rankings

Oct 18, 2021
Vivek Myers, Erdem Bıyık, Nima Anari, Dorsa Sadigh
