Alekh Agarwal

Offline Imitation Learning from Multiple Baselines with Applications to Compiler Optimization

Mar 28, 2024
Teodor V. Marinov, Alekh Agarwal, Mircea Trofin

Via arXiv

Stochastic Gradient Succeeds for Bandits

Feb 27, 2024
Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvari, Dale Schuurmans

Via arXiv

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

Feb 11, 2024
Kaiwen Wang, Owen Oertell, Alekh Agarwal, Nathan Kallus, Wen Sun

Via arXiv

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

Jan 08, 2024
Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal

Via arXiv

Theoretical guarantees on the best-of-n alignment policy

Jan 03, 2024
Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D'Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

Via arXiv

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

Dec 21, 2023
Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D'Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant

Via arXiv

Efficient End-to-End Visual Document Understanding with Rationale Distillation

Nov 16, 2023
Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova

Via arXiv

A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks

May 26, 2023
Jacob Abernethy, Alekh Agarwal, Teodor V. Marinov, Manfred K. Warmuth

Via arXiv

An Empirical Evaluation of Federated Contextual Bandit Algorithms

Mar 17, 2023
Alekh Agarwal, H. Brendan McMahan, Zheng Xu

Via arXiv

Leveraging User-Triggered Supervision in Contextual Bandits

Feb 07, 2023
Alekh Agarwal, Claudio Gentile, Teodor V. Marinov

Via arXiv