Haipeng Luo

Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

Nov 01, 2023

Online Learning in Contextual Second-Price Pay-Per-Click Auctions

Oct 08, 2023

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Aug 18, 2023

No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions

May 30, 2023

Regret Matching+: (In)Stability and Fast Convergence in Games

May 24, 2023

Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games

Mar 05, 2023

Improved Best-of-Both-Worlds Guarantees for Multi-Armed Bandits: FTRL with General Regularizers and Multiple Optimal Arms

Feb 27, 2023

Average-Constrained Policy Optimization

Feb 02, 2023

Refined Regret for Adversarial MDPs with Linear Function Approximation

Jan 30, 2023

Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Dec 31, 2022