Borong Zhang

PKU-SafeRLHF: A Safety Alignment Preference Dataset for Llama Family Models

Jun 20, 2024

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

Feb 06, 2024

AI Alignment: A Comprehensive Survey

Nov 01, 2023

Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Oct 19, 2023

Baichuan 2: Open Large-scale Language Models

Sep 20, 2023

Safe DreamerV3: Safe Reinforcement Learning with World Models

Jul 14, 2023

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

May 16, 2023