Lennie Wells

KL-Regularised Q-Learning: A Token-level Action-Value perspective on Online RLHF

Aug 23, 2025
Via arXiv

Some Notes on the Sample Complexity of Approximate Channel Simulation

May 14, 2024
Via arXiv

Efficient Algorithms for the CCA Family: Unconstrained Objectives with Unbiased Gradients

Oct 02, 2023
Via arXiv

A Generalized EigenGame with Extensions to Multiview Representation Learning

Nov 21, 2022
Via arXiv