Jiantao Jiao

Towards Optimal Statistical Watermarking

Dec 13, 2023
Via arXiv

End-to-end Story Plot Generator

Oct 13, 2023
Via arXiv

Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment

Oct 10, 2023
Via arXiv

Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration

Sep 18, 2023
Via arXiv

Noisy Computing of the $\mathsf{OR}$ and $\mathsf{MAX}$ Functions

Sep 07, 2023
Via arXiv

On the Optimal Bounds for Noisy Computing

Jun 21, 2023
Via arXiv

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Jun 06, 2023
Via arXiv

On Optimal Caching and Model Multiplexing for Large Model Inference

Jun 03, 2023
Via arXiv

Doubly Robust Self-Training

Jun 01, 2023
Via arXiv

Online Learning in a Creator Economy

May 19, 2023
Via arXiv