Jonathan D. Chang

Dataset Reset Policy Optimization for RLHF

Apr 16, 2024
Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Kianté Brantley, Dipendra Misra, Jason D. Lee, Wen Sun

Via arXiv

Adversarial Imitation Learning via Boosting

Apr 12, 2024
Jonathan D. Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun

Via arXiv

RL for Consistency Models: Faster Reward Guided Text-to-Image Generation

Mar 25, 2024
Owen Oertell, Jonathan D. Chang, Yiyi Zhang, Kianté Brantley, Wen Sun

Via arXiv

Policy-Gradient Training of Language Models for Ranking

Oct 06, 2023
Ge Gao, Jonathan D. Chang, Claire Cardie, Kianté Brantley, Thorsten Joachims

Via arXiv

Learning to Generate Better Than Your LLM

Jun 20, 2023
Jonathan D. Chang, Kianté Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun

Via arXiv

Learning Bellman Complete Representations for Offline Policy Evaluation

Jul 12, 2022
Jonathan D. Chang, Kaiwen Wang, Nathan Kallus, Wen Sun

Via arXiv

Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage

Jun 14, 2021
Jonathan D. Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun

Via arXiv