Yaodong Yang

BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset

Jul 10, 2023
Via arXiv

Policy Space Diversity for Non-Transitive Games

Jun 29, 2023
Via arXiv

Large Sequence Models for Sequential Decision-Making: A Survey

Jun 24, 2023
Via arXiv

Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork

Jun 21, 2023
Via arXiv

Maximum Entropy Heterogeneous-Agent Mirror Learning

Jun 19, 2023
Via arXiv

Heterogeneous Value Evaluation for Large Language Models

Jun 01, 2023
Via arXiv

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

May 16, 2023
Via arXiv

Heterogeneous-Agent Reinforcement Learning

Apr 19, 2023
Via arXiv

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning

Apr 15, 2023
Via arXiv

UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning

Apr 04, 2023
Via arXiv