Joey Hong

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

Nov 30, 2023
Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine

Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations

Nov 09, 2023
Joey Hong, Sergey Levine, Anca Dragan

Offline RL with Observation Histories: Analyzing and Improving Sample Complexity

Oct 31, 2023
Joey Hong, Anca Dragan, Sergey Levine

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

Jul 26, 2023
Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

Learning to Influence Human Behavior with Offline Reinforcement Learning

Mar 10, 2023
Joey Hong, Anca Dragan, Sergey Levine

Multi-Task Off-Policy Learning from Bandit Feedback

Dec 09, 2022
Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

On the Sensitivity of Reward Inference to Misspecified Human Models

Dec 09, 2022
Joey Hong, Kush Bhatia, Anca Dragan

Confidence-Conditioned Value Functions for Offline Reinforcement Learning

Dec 08, 2022
Joey Hong, Aviral Kumar, Sergey Levine

When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?

Apr 12, 2022
Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine

Compositional Generalization and Decomposition in Neural Program Synthesis

Apr 07, 2022
Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton
