Juntao Dai

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

Feb 06, 2024
Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang

AI Alignment: A Comprehensive Survey

Nov 01, 2023
Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao

Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Oct 19, 2023
Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, Yaodong Yang

BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset

Jul 10, 2023
Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Chi Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

May 16, 2023
Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang

Constrained Update Projection Approach to Safe Policy Optimization

Sep 15, 2022
Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning

Feb 15, 2022
Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan
