Jason D. Lee

Task Diversity Shortens the ICL Plateau

Oct 07, 2024
Via arXiv

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Oct 06, 2024
Via arXiv

Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank

Oct 01, 2024
Via arXiv

Correcting the Mythos of KL-Regularization: Direct Alignment without Overparameterization via Chi-squared Preference Optimization

Jul 18, 2024
Via arXiv

Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity

Jun 28, 2024
Via arXiv

Scaling Laws in Linear Regression: Compute, Parameters, and Data

Jun 12, 2024
Via arXiv

Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot

Jun 11, 2024
Via arXiv

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit

Jun 03, 2024
Via arXiv

REBEL: Reinforcement Learning via Regressing Relative Rewards

Apr 25, 2024
Via arXiv

Dataset Reset Policy Optimization for RLHF

Apr 15, 2024
Via arXiv