Lihong Li

MESOB: Balancing Equilibria & Social Optimality

Jul 16, 2023
Xin Guo, Lihong Li, Sareh Nabi, Rabih Salhab, Junzi Zhang

Offline Policy Optimization in RL with Variance Regularizaton

Dec 29, 2022
Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Animesh Garg, Zhaoran Wang, Lihong Li, Doina Precup

A Reinforcement Learning Approach to Estimating Long-term Treatment Effects

Oct 14, 2022
Ziyang Tang, Yiheng Duan, Stephanie Zhang, Lihong Li

Understanding Domain Randomization for Sim-to-real Transfer

Oct 07, 2021
Xiaoyu Chen, Jiachen Hu, Chi Jin, Lihong Li, Liwei Wang

A Map of Bandits for E-commerce

Jul 01, 2021
Yi Liu, Lihong Li

On the Optimality of Batch Policy Optimization Algorithms

Apr 06, 2021
Chenjun Xiao, Yifan Wu, Tor Lattimore, Bo Dai, Jincheng Mei, Lihong Li, Csaba Szepesvári, Dale Schuurmans

Near-optimal Representation Learning for Linear Bandits and Linear RL

Feb 08, 2021
Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang

CoinDICE: Off-Policy Confidence Interval Estimation

Oct 22, 2020
Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans

Neural Thompson Sampling

Oct 02, 2020
Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu

Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL

Sep 15, 2020
Xiaoyu Chen, Jiachen Hu, Lihong Li, Liwei Wang
