David Brandfonbrener

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Feb 22, 2024
Kenneth Li, Samy Jelassi, Hugh Zhang, Sham Kakade, Martin Wattenberg, David Brandfonbrener

Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search

Feb 13, 2024
David Brandfonbrener, Sibi Raja, Tarun Prasad, Chloe Loughridge, Jianang Yang, Simon Henniger, William E. Byrd, Robert Zinkov, Nada Amin

Repeat After Me: Transformers are Better than State Space Models at Copying

Feb 01, 2024
Samy Jelassi, David Brandfonbrener, Sham M. Kakade, Eran Malach

Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation

May 26, 2023
David Brandfonbrener, Ofir Nachum, Joan Bruna

Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based Reinforcement Learning

Oct 05, 2022
David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jake Varley

Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning

Jun 02, 2022
David Brandfonbrener, Remi Tachet des Combes, Romain Laroche

When does return-conditioned supervised learning work for offline reinforcement learning?

Jun 02, 2022
David Brandfonbrener, Alberto Bietti, Jacob Buckman, Romain Laroche, Joan Bruna

Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning

Feb 08, 2022
Denis Yarats, David Brandfonbrener, Hao Liu, Michael Laskin, Pieter Abbeel, Alessandro Lazaric, Lerrel Pinto

Quantile Filtered Imitation Learning

Dec 02, 2021
David Brandfonbrener, William F. Whitney, Rajesh Ranganath, Joan Bruna

Offline RL Without Off-Policy Evaluation

Jun 16, 2021
David Brandfonbrener, William F. Whitney, Rajesh Ranganath, Joan Bruna
