Benjamin Van Roy

Stanford University, Department of Electrical Engineering

Efficient Exploration for LLMs

Feb 01, 2024

An Information-Theoretic Analysis of In-Context Learning

Jan 28, 2024

RLHF and IIA: Perverse Incentives

Dec 02, 2023

Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling

Oct 14, 2023

Maintaining Plasticity via Regenerative Regularization

Aug 23, 2023

A Definition of Continual Reinforcement Learning

Jul 20, 2023

On the Convergence of Bounded Agents

Jul 20, 2023

Continual Learning as Computationally Constrained Reinforcement Learning

Jul 10, 2023

Scalable Neural Contextual Bandit for Recommender Systems

Jun 26, 2023

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

May 19, 2023