Jiantao Jiao

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Jun 06, 2023
Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao

On Optimal Caching and Model Multiplexing for Large Model Inference

Jun 03, 2023
Banghua Zhu, Ying Sheng, Lianmin Zheng, Clark Barrett, Michael I. Jordan, Jiantao Jiao

Doubly Robust Self-Training

Jun 01, 2023
Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael Jordan, Jiantao Jiao

Online Learning in a Creator Economy

May 19, 2023
Banghua Zhu, Sai Praneeth Karimireddy, Jiantao Jiao, Michael I. Jordan

Beyond UCB: Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits

Feb 12, 2023
Nived Rajaraman, Yanjun Han, Jiantao Jiao, Kannan Ramchandran

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

Jan 30, 2023
Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao

Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons

Jan 30, 2023
Banghua Zhu, Jiantao Jiao, Michael I. Jordan

Online Learning in Stackelberg Games with an Omniscient Follower

Jan 27, 2023
Geng Zhao, Banghua Zhu, Jiantao Jiao, Michael I. Jordan

The Sample Complexity of Online Contract Design

Nov 10, 2022
Banghua Zhu, Stephen Bates, Zhuoran Yang, Yixin Wang, Jiantao Jiao, Michael I. Jordan
