Andrea Zanette

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

Feb 29, 2024
Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement

Feb 24, 2024
Ruiqi Zhang, Yuexiang Zhai, Andrea Zanette

Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data

Jul 10, 2023
Ruiqi Zhang, Andrea Zanette

When is Realizability Sufficient for Off-Policy Reinforcement Learning?

Nov 10, 2022
Andrea Zanette

Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning

Jun 01, 2022
Andrea Zanette, Martin J. Wainwright

Bellman Residual Orthogonalization for Offline Reinforcement Learning

Mar 24, 2022
Andrea Zanette, Martin J. Wainwright

Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning

Aug 19, 2021
Andrea Zanette, Martin J. Wainwright, Emma Brunskill

Design of Experiments for Stochastic Contextual Linear Bandits

Jul 22, 2021
Andrea Zanette, Kefan Dong, Jonathan Lee, Emma Brunskill

Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation

Mar 24, 2021
Andrea Zanette, Ching-An Cheng, Alekh Agarwal
