Zhihan Liu

How Can LLM Guide RL? A Value-Based Approach

Feb 25, 2024
Shenao Zhang, Sirui Zheng, Shuqi Ke, Zhihan Liu, Wanxin Jin, Jianbo Yuan, Yingxiang Yang, Hongxia Yang, Zhaoran Wang

A Principled Framework for Knowledge-enhanced Large Language Model

Nov 18, 2023
Saizhuo Wang, Zhihan Liu, Zhaoran Wang, Jian Guo

Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency

Oct 11, 2023
Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang

Sample-Efficient Multi-Agent RL: An Optimization Perspective

Oct 10, 2023
Nuoya Xiong, Zhihan Liu, Zhaoran Wang, Zhuoran Yang

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration

May 29, 2023
Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang

Guarded Policy Optimization with Imperfect Online Demonstrations

Mar 03, 2023
Zhenghai Xue, Zhenghao Peng, Quanyi Li, Zhihan Liu, Bolei Zhou

Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation

Aug 19, 2021
Zhihan Liu, Yufeng Zhang, Zuyue Fu, Zhuoran Yang, Zhaoran Wang
