Audrey Huang

A Unifying View of Coverage in Linear Off-Policy Evaluation

Jan 26, 2026
Via arXiv

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Mar 27, 2025
Via arXiv

Computational-Statistical Tradeoffs at the Next-Token Prediction Barrier: Autoregressive and Imitation Learning under Misspecification

Feb 18, 2025
Via arXiv

Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol

Feb 11, 2025
Via arXiv

Self-Improvement in Language Models: The Sharpening Mechanism

Dec 02, 2024
Via arXiv

Correcting the Mythos of KL-Regularization: Direct Alignment without Overparameterization via Chi-squared Preference Optimization

Jul 18, 2024
Via arXiv

Reinforcement Learning in Low-Rank MDPs with Density Features

Feb 04, 2023
Via arXiv

Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions

Oct 27, 2022
Via arXiv

Off-Policy Risk Assessment in Markov Decision Processes

Sep 21, 2022
Via arXiv

Supervised Learning with General Risk Functionals

Jun 27, 2022
Via arXiv