Yuqing Du

Teaching Large Language Models to Reason with Reinforcement Learning

Mar 07, 2024
Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu

Via arXiv

Learning to Model the World with Language

Jul 31, 2023
Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan

Via arXiv

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

May 25, 2023
Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee

Via arXiv

Vision-Language Models as Success Detectors

Mar 13, 2023
Yuqing Du, Ksenia Konyushkova, Misha Denil, Akhil Raju, Jessica Landon, Felix Hill, Nando de Freitas, Serkan Cabi

Via arXiv

Aligning Text-to-Image Models using Human Feedback

Feb 23, 2023
Kimin Lee, Hao Liu, Moonkyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu

Via arXiv

Guiding Pretraining in Reinforcement Learning with Large Language Models

Feb 13, 2023
Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas

Via arXiv

It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation

Feb 22, 2022
Yuqing Du, Pieter Abbeel, Aditya Grover

Via arXiv

Bayesian Imitation Learning for End-to-End Mobile Manipulation

Feb 15, 2022
Yuqing Du, Daniel Ho, Alexander A. Alemi, Eric Jang, Mohi Khansari

Via arXiv

Practical Imitation Learning in the Real World via Task Consistency Loss

Feb 03, 2022
Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang

Via arXiv

Auto-Tuned Sim-to-Real Transfer

Apr 15, 2021
Yuqing Du, Olivia Watkins, Trevor Darrell, Pieter Abbeel, Deepak Pathak

Via arXiv