
Ziniu Li

Why Transformers Need Adam: A Hessian Perspective

Feb 26, 2024
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo

Via arXiv

Policy Optimization in RLHF: The Impact of Out-of-preference Data

Dec 17, 2023
Ziniu Li, Tian Xu, Yang Yu

Via arXiv

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models

Oct 17, 2023
Ziniu Li, Tian Xu, Yushun Zhang, Yang Yu, Ruoyu Sun, Zhi-Quan Luo

Via arXiv

Provably Efficient Adversarial Imitation Learning with Unknown Transitions

Jun 11, 2023
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo

Via arXiv

Deploying Offline Reinforcement Learning with Human Feedback

Mar 13, 2023
Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao

Via arXiv

Theoretical Analysis of Offline Imitation With Supplementary Dataset

Jan 27, 2023
Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo

Via arXiv

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis

Aug 03, 2022
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo

Via arXiv

A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle

Mar 22, 2022
Ziniu Li, Tian Xu, Yang Yu

Via arXiv

Rethinking ValueDice: Does It Really Improve Performance?

Feb 05, 2022
Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo

Via arXiv

Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions

Jun 19, 2021
Tian Xu, Ziniu Li, Yang Yu

Via arXiv