Haipeng Luo

Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments

May 25, 2022
Liyu Chen, Haipeng Luo

Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games

Apr 25, 2022
Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Tuomas Sandholm

Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits

Feb 12, 2022
Haipeng Luo, Mengxiao Zhang, Peng Zhao, Zhi-Hua Zhou

Adaptive Bandit Convex Optimization with Heterogeneous Curvature

Feb 12, 2022
Haipeng Luo, Mengxiao Zhang, Peng Zhao

Policy Optimization for Stochastic Shortest Path

Feb 07, 2022
Liyu Chen, Haipeng Luo, Aviv Rosenberg

Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games

Feb 01, 2022
Gabriele Farina, Chung-Wei Lee, Haipeng Luo, Christian Kroer

Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints

Jan 31, 2022
Liyu Chen, Rahul Jain, Haipeng Luo

Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback

Jan 31, 2022
Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, Aviv Rosenberg

No-Regret Learning in Time-Varying Zero-Sum Games

Jan 30, 2022
Mengxiao Zhang, Peng Zhao, Haipeng Luo, Zhi-Hua Zhou

Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP

Dec 18, 2021
Liyu Chen, Rahul Jain, Haipeng Luo
