Tom Le Paine

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Reinforced Self-Training (ReST) for Language Modeling

Aug 21, 2023

AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning

Aug 07, 2023

$\pi2\text{vec}$: Policy Representations with Successor Features

Jun 16, 2023

On Instrumental Variable Regression for Deep Offline Policy Evaluation

May 21, 2021

Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization

Apr 28, 2021

Benchmarks for Deep Off-Policy Evaluation

Mar 30, 2021

Hyperparameter Selection for Offline Reinforcement Learning

Jul 17, 2020

RL Unplugged: Benchmarks for Offline Reinforcement Learning

Jul 02, 2020