Peter J. Liu

LiPO: Listwise Preference Optimization through Learning-to-Rank

Feb 02, 2024
Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Dec 22, 2023
Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel

Self-Evaluation Improves Selective Generation in Large Language Models

Dec 14, 2023
Jie Ren, Yao Zhao, Tu Vu, Peter J. Liu, Balaji Lakshminarayanan

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

Nov 15, 2023
C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L Bileschi, Gamaleldin F Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, JD Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, Peter J. Liu, Roman Novak, Yundi Qian, Noah Fiedel, Jascha Sohl-Dickstein

Improving Large Language Model Fine-tuning for Solving Math Problems

Oct 16, 2023
Yixin Liu, Avi Singh, C. Daniel Freeman, John D. Co-Reyes, Peter J. Liu

Small-scale proxies for large-scale Transformer training instabilities

Sep 25, 2023
Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith

Statistical Rejection Sampling Improves Preference Optimization

Sep 13, 2023
Tianqi Liu, Yao Zhao, Rishabh Joshi, Misha Khalman, Mohammad Saleh, Peter J. Liu, Jialu Liu

SLiC-HF: Sequence Likelihood Calibration with Human Feedback

May 17, 2023
Yao Zhao, Rishabh Joshi, Tianqi Liu, Misha Khalman, Mohammad Saleh, Peter J. Liu

Improving the Robustness of Summarization Models by Detecting and Removing Input Noise

Dec 20, 2022
Kundan Krishna, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming Luo, Mohammad Saleh, Peter J. Liu

Calibrating Sequence likelihood Improves Conditional Language Generation

Sep 30, 2022
Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu
